Abstract
In this paper, we describe a finite-difference domain decomposition strategy for implementing and optimizing GPU codes that solve the 2-D Navier-Stokes (N-S) equations. To suit the GPU architecture, our algorithms emphasize the decomposition strategy and maximal exploitation of the GPU memory hierarchy, so that a high speedup can be expected. We test two CFD cases: cavity flow and the RAE 2822 aerofoil. For cavity flow, we ran our simulation on both the CUDA and OpenCL platforms and observed a 30–60x speedup. For the aerofoil, we used 6–60 GPU devices and obtained speedups of 5–29 times, depending on the grid size and the number of devices used.
Appendix A
In this appendix, the code shows how GPU devices are assigned to different MPI processes. To implement the domain decomposition strategy, GPU devices must be numbered contiguously along the x and y axes so that tasks can be dispatched according to each device's position.
![figure a1](http://media.springernature.com/lw685/springer-static/image/chp%3A10.1007%2F978-3-642-16405-7_27/MediaObjects/214239_1_En_27_Figa1_HTML.gif)
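The position-based numbering described in this appendix can be illustrated with a minimal C sketch. It is not the chapter's code: the `GridSlot` type, the `grid_slot` and `block_start` helpers, the row-major numbering, and the rule `rank % gpus_per_node` for picking a device on a node are all assumptions made for illustration. The MPI and CUDA calls are left out so that the index arithmetic can be read on its own.

```c
#include <assert.h>

/* Position of one GPU (one MPI rank) in a hypothetical px-by-py process grid. */
typedef struct {
    int ix, iy;        /* block coordinates along x and y */
    int west, east;    /* neighbour ranks along x, -1 at a domain boundary */
    int south, north;  /* neighbour ranks along y, -1 at a domain boundary */
    int local_device;  /* device index on the node (assumes ranks fill nodes in order) */
} GridSlot;

/* Assumed row-major numbering: rank = iy * px + ix, so ranks are
 * contiguous along x and advance by px along y. */
static GridSlot grid_slot(int rank, int px, int py, int gpus_per_node)
{
    assert(px > 0 && py > 0 && gpus_per_node > 0);
    assert(rank >= 0 && rank < px * py);

    GridSlot s;
    s.ix = rank % px;
    s.iy = rank / px;
    s.west  = (s.ix > 0)      ? rank - 1  : -1;
    s.east  = (s.ix < px - 1) ? rank + 1  : -1;
    s.south = (s.iy > 0)      ? rank - px : -1;
    s.north = (s.iy < py - 1) ? rank + px : -1;
    s.local_device = rank % gpus_per_node;
    return s;
}

/* First grid index owned by block i when n points are split into p blocks;
 * block i owns the half-open range [block_start(i), block_start(i + 1)). */
static int block_start(int i, int n, int p)
{
    return (int)((long long)i * n / p);
}
```

In an MPI and CUDA program, `rank` would typically come from `MPI_Comm_rank`, and `local_device` would be passed to `cudaSetDevice` before any device memory is allocated. The neighbour ranks then identify which processes exchange halo cells along each axis.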
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Li, S., Li, X., Wang, L., Lu, Z., Chi, X. (2013). Accelerating 2-Dimensional CFD on Multi-GPU Supercomputer. In: Yuen, D., Wang, L., Chi, X., Johnsson, L., Ge, W., Shi, Y. (eds) GPU Solutions to Multi-scale Problems in Science and Engineering. Lecture Notes in Earth System Sciences. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16405-7_27
DOI: https://doi.org/10.1007/978-3-642-16405-7_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16404-0
Online ISBN: 978-3-642-16405-7
eBook Packages: Earth and Environmental Science, Earth and Environmental Science (R0)