Abstract
Large-scale parallel machines are programmed mainly with the single program, multiple data (SPMD) model of parallelism. While this model has advantages of scalability and simplicity, it does not fit well with divide-and-conquer parallelism or hierarchical machines that mix shared and distributed memory. In this paper, we define the recursive single program, multiple data model (RSPMD) that extends SPMD with a hierarchical team mechanism to support hierarchical algorithms and machines. We implement this model in the Titanium language and describe how to eliminate a class of deadlocks by ensuring alignment of collective operations. We present application case studies evaluating the RSPMD model, showing that it enables divide-and-conquer algorithms such as sorting to be elegantly expressed and that team collective operations increase performance of conjugate gradient by up to a factor of two. The model also facilitates optimizations for hierarchical machines, improving scalability of particle in cell by 8x and performance of sorting and a stencil code by up to 40 % and 14 %, respectively.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Throughout this paper, we highlight team operations in a bold, green color and collective operations in bold purple.
References
Allen, E., et al.: The Fortress language specification, Version 0.866. Sun Microsystem Inc. (2006)
Bailey, D., et al.: The NAS parallel benchmarks. Int. J. Supercomput. Appl. 5(3), 63–73 (1991)
Bikshandi, G., et al.: Programming for parallelism and locality with hierarchically tiled arrays. In: PPoPP ’06: Proceedings of the Eleventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (2006)
Blelloch, G.E.: NESL: a nested data-parallel language (3.1). Technical report CMU-CS-95-170, Carnegie Mellon University (1995)
Bonachea, D.: GASNet specification, v1.1. Technical report UCB/CSD-02-1207, University of California, Berkeley (2002)
Carlson, W., et al.: Introduction to UPC and language specification. Technical report CCS-TR-99-157, IDA Center for Computing Sciences (1999)
Cray Inc.: Chapel Specification 4 (2005)
Datta, K., Bonachea, D., Yelick, K.A.: Titanium performance and potential: an NPB experimental study. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds.) LCPC 2005. LNCS, vol. 4339, pp. 200–214. Springer, Heidelberg (2006)
Fatahalian, K., et al.: Sequoia: programming the memory hierarchy. In: Proceedings of the ACM/IEEE SC 2006 Conference on Supercomputing, SC ’06 (2006)
Garland, M., Kudlur, M., Zheng, Y.: Designing a unified programming model for heterogeneous machines. In: Supercomputing 2012 (2012)
Hardwick, J.C.: Practical parallel divide-and-conquer algorithms. Ph.D. thesis, Carnegie Mellon University (1997)
Huang, J., Chow, Y.: Parallel sorting and data partitioning by sampling. In: 7th International Computer Software and Applications Conference (1983)
Peyton Jones, S.: Harnessing the multicores: nested data parallelism in Haskell. In: Ramalingam, G. (ed.) APLAS 2008. LNCS, vol. 5356, pp. 138–138. Springer, Heidelberg (2008)
Kamil, A.: Single program, multiple data programming for hierarchical computations. Ph.D. thesis, University of California, Berkeley (2012)
Kamil, A.,Yelick, K.: Concurrency analysis for parallel programs with textually aligned barriers. In: Proceedings of the 18th International Workshop on Languages and Compilers for Parallel Computing (2005)
Kamil, A., Yelick, K.A.: Hierarchical pointer analysis for distributed programs. In: Riis Nielson, H., Filé, G. (eds.) SAS 2007. LNCS, vol. 4634, pp. 281–297. Springer, Heidelberg (2007)
Kamil, A., Yelick, K.: Enforcing textual alignment of collectives using dynamic checks. In: Proceedings of the 22nd International Workshop on Languages and Compilers for Parallel Computing (2009)
Kamil, A., Yelick, K.: Hierarchical additions to the SPMD programming model. Technical report UCB/EECS-2012-20, University of California, Berkeley (2012)
Message Passing Interface Forum. MPI: A message-passing interface standard, version 1.1 (1995)
Numrich, R., Reid, J.: Co-array Fortran for parallel programming. Technical report RAL-TR-1998-060, Rutherford Appleton Laboratory (1998)
Yan, Y., et al.: Hierarchical place trees: a portable abstraction for task parallelism and data movement. In: Proceedings of the 22nd International Workshop on Languages and Compilers for Parallel Computing (2009)
Yau, S.M.: Experience in using Titanium for simulation of immersed boundary biological systems. Master’s thesis, University of California, Berkeley (2002)
Yelick, K., et al.: Titanium: a high-performance Java dialect. In: Workshop on Java for High-Performance Network Computing (1998)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kamil, A., Yelick, K. (2014). Hierarchical Computation in the SPMD Programming Model. In: Cașcaval, C., Montesinos, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2013. Lecture Notes in Computer Science(), vol 8664. Springer, Cham. https://doi.org/10.1007/978-3-319-09967-5_1
Download citation
DOI: https://doi.org/10.1007/978-3-319-09967-5_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09966-8
Online ISBN: 978-3-319-09967-5
eBook Packages: Computer ScienceComputer Science (R0)