
Performance comparison of multi-container deployment schemes for HPC workloads: an empirical study

The Journal of Supercomputing

Abstract

The high-performance computing (HPC) community has recently started to use containerization to obtain fast, customized, portable, flexible, and reproducible deployments of its workloads. Previous work showed that deploying an HPC workload into a single container can retain bare-metal performance. However, there is a lack of research on multi-container deployments that partition the processes belonging to each application into different containers. Partitioning HPC applications has been shown to improve their performance on virtual machines by allowing each partition to be pinned to a non-uniform memory access (NUMA) domain. Consequently, it is essential to understand the performance implications of distinct multi-container deployment schemes for HPC workloads, focusing on the impact of container granularity and its combination with processor and memory affinity. This paper presents a systematic performance comparison and analysis of multi-container deployment schemes for HPC workloads on a single-node platform, considering different containerization technologies (including Docker and Singularity), two platform architectures (UMA and NUMA), and two application subscription modes (exact subscription and over-subscription). Our results indicate that finer-grained multi-container deployments can, on the one hand, benefit the performance of some applications with low interprocess communication, especially in over-subscribed scenarios and when combined with affinity, but, on the other hand, can incur performance degradation for communication-intensive applications when using containerization technologies that deploy isolated network namespaces.
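To illustrate the kind of fine-grained, affinity-aware deployment the abstract refers to, the snippet below is a minimal sketch (not the authors' deployment scripts) of launching one Docker container per NUMA domain and pinning each to its domain with Docker's --cpuset-cpus and --cpuset-mems options. The image name, application command, NUMA layout, and core ranges are hypothetical, and the cross-container MPI wiring (shared or bridged network namespaces) that the communication-intensive results depend on is omitted for brevity.

```python
#!/usr/bin/env python3
"""Sketch: fine-grained multi-container deployment with per-NUMA-domain
processor and memory affinity (hypothetical image, command, and layout)."""

import subprocess

# Hypothetical 2-socket NUMA platform: 8 cores per socket.
NUMA_DOMAINS = [
    {"node": 0, "cpus": "0-7"},
    {"node": 1, "cpus": "8-15"},
]

IMAGE = "hpc-app:latest"          # hypothetical container image
APP_CMD = "mpirun -np 8 ./app"    # hypothetical MPI command, 8 ranks per container

for domain in NUMA_DOMAINS:
    name = f"hpc-part-{domain['node']}"
    cmd = [
        "docker", "run", "-d", "--name", name,
        "--cpuset-cpus", domain["cpus"],        # processor affinity
        "--cpuset-mems", str(domain["node"]),   # memory affinity
        IMAGE, "bash", "-c", APP_CMD,
    ]
    subprocess.run(cmd, check=True)             # start one container per NUMA domain
    print(f"started {name} pinned to NUMA node {domain['node']}")
```

A coarser-grained scheme would collapse this loop into a single container spanning all cores, whereas Singularity-style runtimes, which reuse the host namespaces, would typically rely on numactl or the MPI launcher for equivalent pinning.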



Notes

  1. https://www.docker.com/.

  2. https://sylabs.io/.

  3. http://icl.cs.utk.edu/hpcc/.

  4. https://tools.bsc.es/paraver.

  5. http://man7.org/linux/man-pages/man1/perf.1.html.


Acknowledgements

We thank Lenovo for providing the technical infrastructure to run the experiments in this paper. This work was partially supported by Lenovo as part of the Lenovo-BSC collaboration agreement, by the Spanish Government under contract PID2019-107255GB-C22, and by the Generalitat de Catalunya under contract 2017-SGR-1414 and under grant 2020 FI-B 00257.

Author information

Corresponding author

Correspondence to Peini Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, P., Guitart, J. Performance comparison of multi-container deployment schemes for HPC workloads: an empirical study. J Supercomput 77, 6273–6312 (2021). https://doi.org/10.1007/s11227-020-03518-1
