Abstract
Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of general-purpose GPUs and NVIDIA’s GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. This article presents a concise introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We illustrate each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. Our final GPU implementation outperforms the CPU implementation by a factor of 1700.
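To make the running example concrete, the all-pairs distance calculation can be sketched as a plain serial reference. This is a minimal illustration rather than the authors' implementation: the abstract does not name a distance metric, so Euclidean distance is assumed here, and the function name `all_pairs_distance` is hypothetical.

```python
import math


def all_pairs_distance(points):
    """Return the n x n matrix of distances between the rows of `points`.

    Assumption: Euclidean distance; the abstract does not specify the metric.
    """
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The matrix is symmetric, so only the pairs with j > i are computed.
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
            dist[i][j] = d
            dist[j][i] = d
    return dist


if __name__ == "__main__":
    # Three instances with two features each (made-up illustrative values).
    data = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    for row in all_pairs_distance(data):
        print(row)
```

Each entry of the matrix depends only on its own pair of instances, so the entries can be computed independently of one another. That independence is what makes the calculation a natural candidate for the data-parallel execution model of a GPU.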
Additional information
These authors contributed equally to this work.
About this article
Cite this article
Payne, J.L., Sinnott-Armstrong, N.A. & Moore, J.H. Exploiting graphics processing units for computational biology and bioinformatics. Interdiscip Sci Comput Life Sci 2, 213–220 (2010). https://doi.org/10.1007/s12539-010-0002-4
DOI: https://doi.org/10.1007/s12539-010-0002-4