Optimizing BIT1, a Particle-in-Cell Monte Carlo Code, with OpenMP/OpenACC and GPU Acceleration

  • Conference paper
Computational Science – ICCS 2024 (ICCS 2024)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14832))

Abstract

On the path toward developing the first fusion energy devices, plasma simulations have become indispensable tools for supporting the design and development of fusion machines. Among these critical simulation tools, BIT1 is an advanced Particle-in-Cell code with Monte Carlo collisions, specifically designed for modeling plasma-material interaction and, in particular, analyzing the power load distribution on tokamak divertors. The current implementation of BIT1 relies exclusively on MPI for parallel communication and lacks support for GPUs. In this work, we address these limitations by designing and implementing a hybrid, shared-memory version of BIT1 capable of utilizing GPUs. For shared-memory parallelization, we rely on OpenMP and OpenACC, using a task-based approach to mitigate load-imbalance issues in the particle mover. On an HPE Cray EX computing node, we observe an initial performance improvement of approximately 42%, with scalable performance showing an enhancement of about 38% when using 8 MPI ranks. Still relying on OpenMP and OpenACC, we introduce the first version of BIT1 capable of using GPUs. We investigate two different data movement strategies: unified memory and explicit data movement. Overall, we report BIT1 data transfer findings during each PIC cycle. Among BIT1 GPU implementations, we demonstrate performance improvement through concurrent GPU utilization, especially when MPI ranks are assigned to dedicated GPUs. Finally, we analyze the performance of the first BIT1 GPU porting with the NVIDIA Nsight tools to further our understanding of BIT1’s computational efficiency for large-scale plasma simulations, capable of exploiting current supercomputer infrastructures.
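The two ideas named in the abstract, a task-based particle mover and explicit data movement to the GPU, can be illustrated with a minimal sketch in C. It is not BIT1's code: the `Particle` layout, the function names, the one-task-per-cell granularity, and the free-streaming push (no fields, no collisions) are all assumptions made for illustration. Compiled without OpenMP or OpenACC support, the pragmas are ignored and the code runs serially on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical particle layout; BIT1's real data structures are not
 * described in the abstract. */
typedef struct {
    double x; /* position */
    double v; /* velocity */
} Particle;

/* Advance the particles of one cell by dt (free streaming only). */
static void push_cell(Particle *p, size_t n, double dt)
{
    for (size_t i = 0; i < n; ++i) {
        p[i].x += p[i].v * dt;
    }
}

/* Task-based mover (assumed design): one OpenMP task per cell.
 * Cells may hold very different particle counts, so idle threads pick
 * up the remaining tasks instead of waiting on a fixed loop partition. */
void move_particles_tasks(Particle **cells, const size_t *counts,
                          size_t n_cells, double dt)
{
    assert(cells != NULL && counts != NULL);

    #pragma omp parallel
    #pragma omp single
    {
        for (size_t c = 0; c < n_cells; ++c) {
            #pragma omp task firstprivate(c)
            push_cell(cells[c], counts[c], dt);
        }
    } /* all tasks have completed at the barrier that ends the region */
}

/* Explicit data movement (assumed design): the data clauses copy the
 * positions to the device and back, and the velocities to the device
 * only. With unified (managed) memory enabled at compile time, the
 * clauses could be left out and pages would migrate on demand. */
void move_particles_acc(double *x, const double *v, size_t n, double dt)
{
    assert(x != NULL && v != NULL);

    #pragma acc parallel loop copy(x[0:n]) copyin(v[0:n])
    for (size_t i = 0; i < n; ++i) {
        x[i] += v[i] * dt;
    }
}
```

The per-cell task granularity here is only one possible choice; coarser chunks lower the scheduling overhead, while finer ones balance uneven cells better.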



Acknowledgments

Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and Sweden, Finland, Germany, Greece, France, Slovenia, Spain, and Czech Republic under grant agreement No 101093261.

Author information

Corresponding author

Correspondence to Jeremy J. Williams.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Williams, J.J., Liu, F., Tskhakaya, D., Costea, S., Podolnik, A., Markidis, S. (2024). Optimizing BIT1, a Particle-in-Cell Monte Carlo Code, with OpenMP/OpenACC and GPU Acceleration. In: Franco, L., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2024. ICCS 2024. Lecture Notes in Computer Science, vol 14832. Springer, Cham. https://doi.org/10.1007/978-3-031-63749-0_22

  • DOI: https://doi.org/10.1007/978-3-031-63749-0_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-63748-3

  • Online ISBN: 978-3-031-63749-0

  • eBook Packages: Computer Science, Computer Science (R0)
