Abstract
Multigrid methods are a family of numerical methods that offer linear time and storage complexity for solving several elliptic partial differential equations. The geometrically decaying resolution of the grids in the multigrid hierarchy poses a challenge to achieving high parallel efficiency on highly parallel systems. At the same time, supercomputers have become increasingly heterogeneous with the advent of general-purpose graphics processing units.
This paper presents a highly optimized geometric multigrid Poisson solver that leverages multiple general-purpose graphics processing units with OpenMP target offloading and tasking.
We demonstrate that advanced OpenMP features, such as task dependencies and peer-to-peer data transfers, can decrease the amount of idle time on the accelerators and thereby increase the parallel efficiency for a multigrid application.
Weak scaling results are presented for two high-performance computing systems with NVIDIA and AMD accelerators. We use four NVIDIA Tesla GV100 general-purpose graphics processing units to achieve a parallel efficiency of 94 percent for a solver based on V-cycles with seven multigrid levels.
References
AMD: Introducing AMD CDNA 2 Architecture. https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf (2022). Accessed 23 Mar 2023
Bingham, H., Larsen, P., Barker, A.: Computational Fluid Dynamics. Technical University of Denmark, Kongens Lyngby, Denmark (2020)
Center for Science: GPU nodes - LUMI-G. https://docs.lumi-supercomputer.eu/hardware/lumig/ (2023). Accessed 16 May 2023
DTU Computing Center: DTU Computing Center resources (2022). https://doi.org/10.48714/DTU.HPC.0001
Hewlett Packard Enterprise: HPE Apollo 6500 Gen10 System/HPE ProLiant XL270d Gen10 Server User Guide. https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=a00045705en_us (2019). Accessed 16 May 2023
Jacobsen, D., Senocak, I.: A full-depth amalgamated parallel 3d geometric multigrid solver for GPU clusters. In: Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition (2011). https://doi.org/10.2514/6.2011-946
Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis & transformation. In: International Symposium on Code Generation and Optimization, 2004. CGO 2004, IEEE Computer Society, USA (2004)
LLVM: Support, Getting Involved, and FAQ. https://openmp.llvm.org/SupportAndFAQ.html#build-amdgpu-offload-capable-compiler (2023). Accessed 16 Feb 2023
Madsen, H.: Time Series Analysis. Chapman and Hall (2000)
McCormick, S.: Multilevel Projection Methods for Partial Differential Equations. SIAM, Denver, Colorado (1992)
Onodera, N., Idomura, Y., Hasegawa, Y., Yamashita, S., Shimokawabe, T., Aoki, T.: GPU Acceleration of multigrid preconditioned conjugate gradient solver on block-structured cartesian grid. In: The International Conference on High Performance Computing in Asia-Pacific Region, pp. 120–128. HPC Asia 2021, Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3432261.3432273
OpenMP Architecture Review Board: OpenMP Application Programming Interface - Version 5.0 November 2018. https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5.0.pdf (2018). Accessed 18 May 2023
Rydahl, A., Gammelmark, M., Karlsson, S.: Feasibility Studies in Multi-GPU Target Offloading. In: Klemm, M., de Supinski, B.R., Klinkenberg, J., Neth, B. (eds.) OpenMP in a Modern World: From Multi-device Support to Meta Programming. IWOMP 2022. Lecture Notes in Computer Science, vol. 13527, pp. 81–93. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-15922-0_6
Top 500: November 2022. https://top500.org/lists/top500/2022/11/ (2022). Accessed 18 May 2023
Trottenberg, U., Oosterlee, C.W., Schüller, A.: Multigrid. Academic Press, London, 1 edn. (2001)
Vicherek, J.: Introduction of LUMI supercomputer. https://events.it4i.cz/event/160/attachments/457/1717/lumi-intro.pdf (2023). Accessed 16 May 2023
Acknowledgment
This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 951732. We acknowledge the Danish e-Infrastructure Cooperation (DeiC), Denmark, for awarding this project access to the LUMI supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CSC (Finland) and the LUMI consortium through DeiC, Denmark, Compiler development (DeiC-DTU-N5-20230033). Lastly, we acknowledge DCC [4] for providing access to compute resources.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rydahl, A., Karlsson, S. (2023). Improving a Multigrid Poisson Solver with Peer-to-Peer Communication and Task Dependencies. In: McIntosh-Smith, S., Klemm, M., de Supinski, B.R., Deakin, T., Klinkenberg, J. (eds) OpenMP: Advanced Task-Based, Device and Compiler Programming. IWOMP 2023. Lecture Notes in Computer Science, vol 14114. Springer, Cham. https://doi.org/10.1007/978-3-031-40744-4_9
DOI: https://doi.org/10.1007/978-3-031-40744-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40743-7
Online ISBN: 978-3-031-40744-4
eBook Packages: Computer Science (R0)