Hierarchical Computation in the SPMD Programming Model

  • Conference paper
  • First Online:
Languages and Compilers for Parallel Computing (LCPC 2013)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 8664))

Abstract

Large-scale parallel machines are programmed mainly with the single program, multiple data (SPMD) model of parallelism. While this model has advantages of scalability and simplicity, it does not fit well with divide-and-conquer parallelism or hierarchical machines that mix shared and distributed memory. In this paper, we define the recursive single program, multiple data model (RSPMD) that extends SPMD with a hierarchical team mechanism to support hierarchical algorithms and machines. We implement this model in the Titanium language and describe how to eliminate a class of deadlocks by ensuring alignment of collective operations. We present application case studies evaluating the RSPMD model, showing that it enables divide-and-conquer algorithms such as sorting to be elegantly expressed and that team collective operations increase performance of conjugate gradient by up to a factor of two. The model also facilitates optimizations for hierarchical machines, improving scalability of particle in cell by 8x and performance of sorting and a stencil code by up to 40 % and 14 %, respectively.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free ship** worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    Throughout this paper, we highlight team operations in a bold, green color and collective operations in bold purple.

References

  1. Allen, E., et al.: The Fortress language specification, Version 0.866. Sun Microsystem Inc. (2006)

    Google Scholar 

  2. Bailey, D., et al.: The NAS parallel benchmarks. Int. J. Supercomput. Appl. 5(3), 63–73 (1991)

    Article  Google Scholar 

  3. Bikshandi, G., et al.: Programming for parallelism and locality with hierarchically tiled arrays. In: PPoPP ’06: Proceedings of the Eleventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (2006)

    Google Scholar 

  4. Blelloch, G.E.: NESL: a nested data-parallel language (3.1). Technical report CMU-CS-95-170, Carnegie Mellon University (1995)

    Google Scholar 

  5. Bonachea, D.: GASNet specification, v1.1. Technical report UCB/CSD-02-1207, University of California, Berkeley (2002)

    Google Scholar 

  6. Carlson, W., et al.: Introduction to UPC and language specification. Technical report CCS-TR-99-157, IDA Center for Computing Sciences (1999)

    Google Scholar 

  7. Cray Inc.: Chapel Specification 4 (2005)

    Google Scholar 

  8. Datta, K., Bonachea, D., Yelick, K.A.: Titanium performance and potential: an NPB experimental study. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds.) LCPC 2005. LNCS, vol. 4339, pp. 200–214. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  9. Fatahalian, K., et al.: Sequoia: programming the memory hierarchy. In: Proceedings of the ACM/IEEE SC 2006 Conference on Supercomputing, SC ’06 (2006)

    Google Scholar 

  10. Garland, M., Kudlur, M., Zheng, Y.: Designing a unified programming model for heterogeneous machines. In: Supercomputing 2012 (2012)

    Google Scholar 

  11. Hardwick, J.C.: Practical parallel divide-and-conquer algorithms. Ph.D. thesis, Carnegie Mellon University (1997)

    Google Scholar 

  12. Huang, J., Chow, Y.: Parallel sorting and data partitioning by sampling. In: 7th International Computer Software and Applications Conference (1983)

    Google Scholar 

  13. Peyton Jones, S.: Harnessing the multicores: nested data parallelism in Haskell. In: Ramalingam, G. (ed.) APLAS 2008. LNCS, vol. 5356, pp. 138–138. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  14. Kamil, A.: Single program, multiple data programming for hierarchical computations. Ph.D. thesis, University of California, Berkeley (2012)

    Google Scholar 

  15. Kamil, A.,Yelick, K.: Concurrency analysis for parallel programs with textually aligned barriers. In: Proceedings of the 18th International Workshop on Languages and Compilers for Parallel Computing (2005)

    Google Scholar 

  16. Kamil, A., Yelick, K.A.: Hierarchical pointer analysis for distributed programs. In: Riis Nielson, H., Filé, G. (eds.) SAS 2007. LNCS, vol. 4634, pp. 281–297. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  17. Kamil, A., Yelick, K.: Enforcing textual alignment of collectives using dynamic checks. In: Proceedings of the 22nd International Workshop on Languages and Compilers for Parallel Computing (2009)

    Google Scholar 

  18. Kamil, A., Yelick, K.: Hierarchical additions to the SPMD programming model. Technical report UCB/EECS-2012-20, University of California, Berkeley (2012)

    Google Scholar 

  19. Message Passing Interface Forum. MPI: A message-passing interface standard, version 1.1 (1995)

    Google Scholar 

  20. Numrich, R., Reid, J.: Co-array Fortran for parallel programming. Technical report RAL-TR-1998-060, Rutherford Appleton Laboratory (1998)

    Google Scholar 

  21. Yan, Y., et al.: Hierarchical place trees: a portable abstraction for task parallelism and data movement. In: Proceedings of the 22nd International Workshop on Languages and Compilers for Parallel Computing (2009)

    Google Scholar 

  22. Yau, S.M.: Experience in using Titanium for simulation of immersed boundary biological systems. Master’s thesis, University of California, Berkeley (2002)

    Google Scholar 

  23. Yelick, K., et al.: Titanium: a high-performance Java dialect. In: Workshop on Java for High-Performance Network Computing (1998)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Amir Kamil .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kamil, A., Yelick, K. (2014). Hierarchical Computation in the SPMD Programming Model. In: Cașcaval, C., Montesinos, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2013. Lecture Notes in Computer Science(), vol 8664. Springer, Cham. https://doi.org/10.1007/978-3-319-09967-5_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-09967-5_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09966-8

  • Online ISBN: 978-3-319-09967-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Navigation