AntSM: Efficient Debugging for Shared Memory Parallel Programs

Lee, Jae-Woo; Midkiff, Samuel P.

doi:10.1007/978-3-319-09967-5_12

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8664)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

This paper describes AntSM, a system that uses the inherent parallelism of multi-threaded programs to reduce the overhead of debugging tools based on statistical analysis and invariant-violation detection. The runtime monitoring these tools perform incurs high overhead. The key insight of AntSM is that this overhead can be reduced in parallel programs by sampling the monitoring across parallel regions of the program that perform similar actions. AntSM implements this sampling using a combination of static and dynamic analyses that determine which parts of the program executing in parallel are similar and how many threads are executing them. Experimental results, obtained with the C-DIDUCE debugging tool (a variant of DIDUCE for C) on eleven Pthreads benchmarks from the PARSEC suite, show that monitoring overhead is reduced by up to 18.14 times (8.73 times on average) on an eight-core machine relative to a naive port that performs no sampling.
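The sampling idea in the abstract can be illustrated with a minimal sketch. This is not AntSM's implementation: the abstract does not give the sampling policy, so the round-robin rule in `should_monitor`, the `BitInvariant` class (a simplified bit-mask invariant in the spirit of DIDUCE), and the constants are all hypothetical, and Python threads stand in for Pthreads. The sketch only shows how dividing monitoring among threads that run the same region lowers the number of checks performed.

```python
import threading

NUM_THREADS = 8   # assumption: one thread per core of an eight-core machine
ITERATIONS = 80   # hypothetical number of visits to the monitored point per thread


class BitInvariant:
    """Simplified DIDUCE-style invariant: tracks which bits of a value never changed."""

    def __init__(self):
        self.first = None
        self.mask = ~0  # every bit is assumed invariant until observed to change
        self.lock = threading.Lock()

    def check(self, value):
        """Record `value`; return True if it relaxes the invariant (possible anomaly)."""
        with self.lock:
            if self.first is None:
                self.first = value
                return False
            changed = (value ^ self.first) & self.mask
            self.mask &= ~changed
            return changed != 0


def should_monitor(thread_id, visit, num_threads):
    """Hypothetical policy: threads take turns monitoring successive visits."""
    return visit % num_threads == thread_id


def worker(thread_id, invariant, sampled, checks):
    for visit in range(ITERATIONS):
        value = visit % 4  # toy stand-in for a value computed similarly by all threads
        if not sampled or should_monitor(thread_id, visit, NUM_THREADS):
            invariant.check(value)
            checks[thread_id] += 1  # each thread writes only its own slot


def run(sampled):
    """Run all threads over the same region; return the total number of checks."""
    invariant = BitInvariant()
    checks = [0] * NUM_THREADS
    threads = [
        threading.Thread(target=worker, args=(t, invariant, sampled, checks))
        for t in range(NUM_THREADS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(checks)
```

Under these assumptions, `run(False)` performs every check in every thread, while `run(True)` performs one eighth as many. The overhead reductions reported in the abstract are measured on real benchmarks and do not follow from this toy count.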

Notes

1. AccMon uses special hardware.
2. The differences between DIDUCE and C-DIDUCE come from the former targeting Java and the latter C. These differences are explained in [3].
3. No significant technical challenge prevents us from using OpenMP.

References

1. Software errors cost U.S. economy $59.5 billion annually. NIST News Release 2002–10
2. Hangal, S., Lam, M.S.: Tracking down software bugs using automatic anomaly detection. In: Proceedings of the 24th International Conference on Software Engineering, pp. 291–301 (2002)
3. Fei, L., Midkiff, S.P.: Artemis: practical runtime monitoring of applications for execution anomalies. In: PLDI ’06, pp. 84–95, New York, NY, USA (2006)
4. Zhou, P., Liu, W., Fei, L., Lu, S., Qin, F., Zhou, Y., Midkiff, S.P., Torrellas, J.: AccMon: automatically detecting memory-related bugs via program counter-based invariants. In: Proceedings of MICRO’04 (2004)
5. Liblit, B., Naik, M., Zheng, A.X., Aiken, A., Jordan, M.I.: Scalable statistical bug isolation. In: PLDI ’05 (2005)
6. Liblit, B., Aiken, A., Zheng, A.X., Jordan, M.I.: Bug isolation via remote program sampling. In: PLDI ’03, pp. 141–154 (2003)
7. Liu, C., Yan, X., Fei, L., Han, J., Midkiff, S.P.: Sober: statistical model-based bug localization. In: ESEC/FSE-13: 10th European Software Engineering Conference Held Jointly with 13th International Symposium on Foundations of Software Engineering (2005)
8. The PARSEC Benchmark Suite. http://parsec.cs.princeton.edu
9. Hutchins, M., Foster, H., Goradia, T., Ostrand, T.: Experiments of the effectiveness of dataflow- and controlflow-based test adequacy criteria. In: International Conference on Software Engineering, ICSE ’94, pp. 191–200, Los Alamitos, CA, USA (1994)
10. Ernst, M.D., Czeisler, A., Griswold, W.G., Notkin, D.: Quickly detecting relevant program invariants. In: Proceedings of the 22nd International Conference on Software Engineering, pp. 449–458 (2000)
11. The LLVM Compiler Infrastructure. http://llvm.org
12. Lee, J.-W., Bachega, L.R., Midkiff, S.P., Hu, Y.C.: Ant: a debugging framework for MPI parallel programs. In: Kasahara, H., Kimura, K. (eds.) LCPC 2012. LNCS, vol. 7760, pp. 220–233. Springer, Heidelberg (2013)
13. Totalview user guide. Accessed 28 Sept 2012
14. Lumetta, S.S., Culler, D.E.: The mantis parallel debugger. In: SPDT ’96: Proceedings of the SIGMETRICS Symposium on Parallel and Distributed Tools, pp. 118–126, New York, NY, USA (1996)
15. Sistare, S., Dorenkamp, E., Nevin, N., Loh, E.: MPI support in the Prism programming environment. In: Supercomputing ’99, p. 22 (1999)
16. Wismüller, R., Oberhuber, M., Krammer, J., Hansen, O.: Interactive debugging and performance analysis of massively parallel applications. Parallel Comput. 22(3), 415–442 (1996)
17. Stringhini, D., Navaux, P., de Kergommeaux, J.C.: A selection mechanism to group processes in a parallel debugger. In: Proceedings of 2000 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’00), June 2000
18. Cheng, D., Hood, R.: A portable debugger for parallel and distributed programs. In: Supercomputing ’94, pp. 723–732, November 1994
19. Mirgorodskiy, A.V., Maruyama, N., Miller, B.P.: Problem diagnosis in large-scale computing environments. In: SC ’06, p. 88. ACM (2006)
20. Gao, Q., Qin, F., Panda, D.K.: DMTracker: finding bugs in large-scale parallel programs by detecting anomaly in data movements. In: SC ’07. ACM (2007)
21. Arnold, D.C., Ahn, D.H., de Supinski, B.R., Lee, G.L., Miller, B.P., Schulz, M.: Stack trace analysis for large scale debugging. Parallel and Distributed Processing Symposium, p. 64 (2007)
22. Lee, G.L., Ahn, D.H., Arnold, D.C., de Supinski, B.R., Legendre, M., Miller, B.P., Schulz, M., Liblit, B.: Lessons learned at 208k: towards debugging millions of cores. In: SC ’08, pp. 1–9, Piscataway, NJ, USA (2008)
23. Strom, R.E., Bacon, D.F., Goldberg, A.P., Lowry, A., Yellin, D.M., Yemini, S.A.: Hermes: A Language for Distributed Computing. Prentice-Hall Inc., Upper Saddle River (1991)

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, 47907, USA
Jae-Woo Lee & Samuel P. Midkiff

Corresponding author

Correspondence to Samuel P. Midkiff.

Editor information

Editors and Affiliations

Silicon Valley, Qualcomm Research, San Jose, California, USA
Călin Cașcaval
Silicon Valley, Qualcomm Research, San Jose, California, USA
Pablo Montesinos

About this paper

Cite this paper

Lee, JW., Midkiff, S.P. (2014). AntSM: Efficient Debugging for Shared Memory Parallel Programs. In: Cașcaval, C., Montesinos, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2013. Lecture Notes in Computer Science, vol 8664. Springer, Cham. https://doi.org/10.1007/978-3-319-09967-5_12

DOI: https://doi.org/10.1007/978-3-319-09967-5_12
Published: 01 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09966-8
Online ISBN: 978-3-319-09967-5
eBook Packages: Computer Science, Computer Science (R0)
