Using PRAM Algorithms on a Uniform-Memory-Access Shared-Memory Architecture

Bader, David A.; Illendula, Ajith K.; Moret, Bernard M. E.; Weisse-Bernstein, Nina R.

doi:10.1007/3-540-44688-5_11


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2141)

Included in the following conference series:

International Workshop on Algorithm Engineering


Abstract

The ability to provide uniform shared-memory access to a significant number of processors in a single SMP node brings us much closer to the ideal PRAM parallel computer. In this paper, we develop new techniques for designing a uniform shared-memory algorithm from a PRAM algorithm and present the results of an extensive experimental study demonstrating that the resulting programs scale nearly linearly across a significant range of processors (from 1 to 64) and across the entire range of instance sizes tested. This linear speedup with the number of processors is, to our knowledge, the first ever attained in practice for intricate combinatorial problems. The example we present in detail here is a graph decomposition algorithm that also requires the computation of a spanning tree; this problem is not only of interest in its own right, but is representative of a large class of irregular combinatorial problems that have simple and efficient sequential implementations and fast PRAM algorithms, but have no known efficient parallel implementations. Our results thus offer promise for bridging the gap between the theory and practice of shared-memory parallel algorithms.

Supported in part by NSF CAREER 00-93039, NSF ITR 00-81404, NSF DEB 99-10123, and DOE CSRI-14968

Supported in part by NSF ITR 00-81404

Supported by an NSF Research Experience for Undergraduates (REU)
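
The abstract does not spell out the authors' translation techniques, so the following is only a generic illustration of what moving a PRAM algorithm onto a shared-memory machine with a fixed number of processors can look like. It uses pointer jumping, a standard PRAM building block for tree computations, and simulates the n virtual PRAM processors with a small number of threads that each own a contiguous block of nodes and synchronize at a barrier after every PRAM step. The function name `pointer_jump_roots` and the `num_threads` parameter are hypothetical, and nothing here is taken from the paper's implementation.

```python
import threading


def pointer_jump_roots(parent, num_threads=4):
    """Return the root of every node in a forest given as a parent array.

    parent[i] == i marks a root. This is an illustrative sketch, not the
    paper's method: each PRAM step "for all i in parallel: p[i] = p[p[i]]"
    is split into contiguous blocks, one block per thread.
    """
    n = len(parent)
    if n == 0:
        return []
    num_threads = max(1, min(num_threads, n))

    # Two buffers emulate the synchronous PRAM step: every read in a round
    # sees the values from the previous round, never a half-updated array.
    bufs = [list(parent), [0] * n]

    # After k rounds each pointer skips up to 2**k ancestors, so
    # (n - 1).bit_length() rounds cover the longest possible path.
    rounds = (n - 1).bit_length()
    barrier = threading.Barrier(num_threads)

    def worker(tid):
        # Contiguous block of virtual processors owned by this thread.
        lo = tid * n // num_threads
        hi = (tid + 1) * n // num_threads
        for r in range(rounds):
            src, dst = bufs[r % 2], bufs[(r + 1) % 2]
            for i in range(lo, hi):
                dst[i] = src[src[i]]
            # All threads must finish the round before anyone reads dst.
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bufs[rounds % 2]


if __name__ == "__main__":
    # Two trees: {0, 1, 2, 3} rooted at 0 and {4, 5, 6} rooted at 4.
    print(pointer_jump_roots([0, 0, 1, 2, 4, 4, 5], num_threads=3))
```

Because of CPython's global interpreter lock, this sketch shows only the structure of the translation (block partitioning plus barrier synchronization) and should not be expected to exhibit the speedups reported in the abstract.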



Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, 87131, USA
David A. Bader
Intel Corporation, Rio Rancho, NM, 87124, USA
Ajith K. Illendula
Department of Computer Science, University of New Mexico, Albuquerque, NM, 87131, USA
Bernard M. E. Moret
University of New Mexico, Albuquerque, NM, 87131, USA
Nina R. Weisse-Bernstein


Editor information

Editors and Affiliations

Department of Computer Science, University of Aarhus, BRICS, 8000, Århus, Denmark
Gerth Stølting Brodal
Dipartimento di Ingegneria Elettrica, Università dell’Aquila, Poggio di Roio, 67040, L’Aquila, Italy
Daniele Frigioni
Dipartimento di Informatica e Sistemistica, Università di Roma “La Sapienza”, via Salaria 113, 00198, Roma, Italy
Alberto Marchetti-Spaccamela


About this paper

Cite this paper

Bader, D.A., Illendula, A.K., Moret, B.M.E., Weisse-Bernstein, N.R. (2001). Using PRAM Algorithms on a Uniform-Memory-Access Shared-Memory Architecture. In: Brodal, G.S., Frigioni, D., Marchetti-Spaccamela, A. (eds) Algorithm Engineering. WAE 2001. Lecture Notes in Computer Science, vol 2141. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44688-5_11


DOI: https://doi.org/10.1007/3-540-44688-5_11
Published: 17 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42500-7
Online ISBN: 978-3-540-44688-0
eBook Packages: Springer Book Archive
