Leveraging HPC Profiling and Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations

  • Conference paper
Euro-Par 2023: Parallel Processing Workshops (Euro-Par 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14351))

Abstract

Large-scale plasma simulations are critical for designing and developing next-generation fusion energy devices and for modeling industrial plasmas. BIT1 is a massively parallel Particle-in-Cell code designed specifically for studying plasma-material interaction in fusion devices. Its most salient characteristic is the inclusion of Monte Carlo collision models for different plasma species. In this work, we characterize the single-node, multi-node, and I/O performance of the BIT1 code in two realistic cases using several HPC profilers, such as perf, IPM, Extrae/Paraver, and Darshan. We find that the on-node performance of the BIT1 sorting function is the main performance bottleneck. Strong scaling tests show a parallel performance of 77% and 96% on 2,560 MPI ranks for the two test cases. We demonstrate that communication, load imbalance, and self-synchronization are important factors affecting the performance of BIT1 in large-scale runs.
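The strong scaling figures quoted in the abstract can be read as follows, assuming they denote parallel efficiency relative to a baseline run of the same problem size (the abstract does not define the metric or name the baseline). The sketch below is illustrative only: the function name, the baseline of 128 ranks, and the timings are hypothetical and are not taken from the paper.

```python
def strong_scaling_efficiency(base_ranks, base_time, ranks, time):
    """Parallel efficiency of a run relative to a baseline run.

    Strong scaling keeps the problem size fixed, so the ideal speedup
    equals the ratio of rank counts.
    """
    if min(base_ranks, ranks) <= 0 or min(base_time, time) <= 0:
        raise ValueError("ranks and times must be positive")
    speedup = base_time / time          # measured speedup over the baseline
    ideal_speedup = ranks / base_ranks  # speedup under perfect scaling
    return speedup / ideal_speedup


# Hypothetical wall-clock times in seconds, NOT measurements from the paper.
runs = {128: 100.0, 2560: 6.5}
base_ranks = min(runs)
for ranks, seconds in sorted(runs.items()):
    eff = strong_scaling_efficiency(base_ranks, runs[base_ranks], ranks, seconds)
    print(f"{ranks:5d} ranks: efficiency {eff:.0%}")
```

An efficiency below 100% under this definition means that the run spends part of its time on overheads such as communication or waiting on imbalanced ranks, which is consistent with the factors the abstract lists.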

Acknowledgments

Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and Sweden, Finland, Germany, Greece, France, Slovenia, Spain, and Czech Republic under grant agreement No 101093261.

Author information

Corresponding author

Correspondence to Jeremy J. Williams.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Williams, J.J., Tskhakaya, D., Costea, S., Peng, I.B., Garcia-Gasulla, M., Markidis, S. (2024). Leveraging HPC Profiling and Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations. In: Zeinalipour, D., et al. Euro-Par 2023: Parallel Processing Workshops. Euro-Par 2023. Lecture Notes in Computer Science, vol 14351. Springer, Cham. https://doi.org/10.1007/978-3-031-50684-0_10

  • DOI: https://doi.org/10.1007/978-3-031-50684-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-50683-3

  • Online ISBN: 978-3-031-50684-0

  • eBook Packages: Computer Science, Computer Science (R0)
