Fault Tolerant Molecular-Continuum Flow Simulation

  • Conference paper
High Performance Computing in Science and Engineering '22 (HPCSE 2022)

Abstract

Molecular-continuum simulations couple molecular dynamics (MD) and computational fluid dynamics (CFD) simulations in a domain decomposition sense to assess fluid flow at the nanoscale, e.g., in process engineering applications. When these simulations run on extreme-scale supercomputers, individual compute cores or nodes may fail due to hardware or software errors. Such failures challenge the robustness of numerical simulations in general and of molecular-continuum systems in particular. We introduce a fault tolerance method in our macro-micro-coupling tool (MaMiCo), a previously developed software solution for molecular-continuum simulation. MaMiCo already uses ensemble simulations to cope with statistical errors in the MD solutions; we extend this ensemble approach to recognize failing MPI processes and to react to these failures. Once a failure is detected, the affected MD simulations are removed from the failing MPI processes and relaunched on well-operating MPI process groups. We detail our approach and report scalability results achieved on the supercomputer HAWK at HLRS.
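
The recovery step described in the abstract (take MD simulations off failing processes and relaunch them on healthy process groups) can be illustrated with a small, MPI-free sketch. This is not MaMiCo's implementation: the function name `reassign_failed_instances`, the dictionary layout, and the least-loaded placement policy are all assumptions made for illustration, and the abstract does not state how target process groups are chosen.

```python
def reassign_failed_instances(assignment, groups, failed_groups):
    """Move MD instances off failed process groups onto healthy ones.

    assignment: dict mapping MD instance id -> process group id (hypothetical layout).
    groups: iterable of all process group ids.
    failed_groups: set of process group ids reported as failed.
    Returns a new assignment; instances on healthy groups keep their place.
    """
    healthy = sorted(set(groups) - set(failed_groups))
    if not healthy:
        raise RuntimeError("no healthy process group left to host MD instances")

    # Count how many instances each healthy group already hosts.
    load = {g: 0 for g in healthy}
    new_assignment = {}
    for inst, group in assignment.items():
        if group not in failed_groups:
            new_assignment[inst] = group
            load[group] += 1

    # Assumed policy: relaunch each affected instance on the least-loaded
    # healthy group, breaking ties by the smaller group id.
    affected = sorted(i for i, g in assignment.items() if g in failed_groups)
    for inst in affected:
        target = min(healthy, key=lambda g: (load[g], g))
        new_assignment[inst] = target
        load[target] += 1
    return new_assignment


# Eight MD instances placed round-robin on four process groups; group 2 fails.
initial = {inst: inst % 4 for inst in range(8)}
recovered = reassign_failed_instances(initial, groups=range(4), failed_groups={2})
```

In this example, instances 2 and 6 ran on the failed group and are moved to groups 0 and 1, while all other instances stay where they were. Detecting the failure itself, which the paper handles at the MPI level, is outside the scope of this sketch.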


Notes

  1. https://github.com/HSU-HPC/MaMiCo

Acknowledgements

We thank HLRS and the Gauss Centre for Supercomputing for the provision of computational resources (project GCS-MDDC). We further acknowledge funding for MaMiCo software developments by the project Macro/Micro-Simulation of Phase Decomposition in the Transcritical Regime (MaST) of the Digitalization and Technology Research Center of the Bundeswehr (dtec.bw) and the HSU-internal research funding (project Resilience and Dynamic Noise Reduction at Exascale for Multiscale Simulation Coupling).

Author information

Corresponding author

Correspondence to Philipp Neumann.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Jafari, V. et al. (2024). Fault Tolerant Molecular-Continuum Flow Simulation. In: Nagel, W.E., Kröner, D.H., Resch, M.M. (eds) High Performance Computing in Science and Engineering '22. HPCSE 2022. Springer, Cham. https://doi.org/10.1007/978-3-031-46870-4_30
