Monitoring Performance and Power for Application Characterization with the Cache-Aware Roofline Model

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8384)

Abstract

Accurate on-the-fly characterization of application behaviour requires assessing a set of execution-related parameters at runtime, including performance, power and energy consumption. These parameters can be obtained from the hardware measurement facilities built into modern multi-core architectures, such as performance and energy counters. However, current operating systems (OSs) do not provide the means to obtain these characterization data directly. Thus, the user needs to rely on complex custom-built libraries with limited capabilities, which might introduce significant execution and measurement overheads. In this work, we propose two monitoring tools for efficient performance, power and energy monitoring on systems with modern multi-core CPUs. The tools capture the run-time behaviour of a wide range of applications at different system levels: (i) at the user-space level, and (ii) at the kernel level, by using the OS scheduler to directly capture this information. Although the proposed monitoring facilities are useful for many purposes, we focus herein on their use for application characterization with the recently proposed Cache-aware Roofline model.
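As a rough illustration of how monitored counter values could feed a roofline-style characterization, the sketch below turns one interval of counter deltas into arithmetic intensity, performance and average power, and then evaluates the generic roofline bound min(peak performance, bandwidth × intensity). The `Sample` fields, the function names and all numeric values are assumptions made for this example; they are not the paper's tools or measurements.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Counter deltas over one monitoring interval (hypothetical fields)."""
    flops: float        # floating-point operations retired in the interval
    bytes_moved: float  # bytes loaded/stored, as seen by the core
    energy_j: float     # energy consumed in the interval, in joules
    seconds: float      # length of the interval, in seconds


def characterize(s: Sample) -> tuple[float, float, float]:
    """Return (arithmetic intensity in flops/byte, GFLOP/s, average watts)."""
    intensity = s.flops / s.bytes_moved
    gflops = s.flops / s.seconds / 1e9
    watts = s.energy_j / s.seconds
    return intensity, gflops, watts


def roofline_bound(intensity: float, peak_gflops: float, bandwidth_gbs: float) -> float:
    """Attainable GFLOP/s under a single roof: min(peak, bandwidth * intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity)
```

Comparing the measured GFLOP/s against `roofline_bound` for the same intensity indicates how far an interval sits below the roof. Evaluating the bound once per memory level, each with its own bandwidth, would give the multiple roofs that a cache-aware model distinguishes.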

Notes

  1. A scheduler tick is a periodic interrupt that updates the execution statistics and checks whether it is necessary to preempt the running task.

  2. The Intel i7-3770K is a 4-core processor based on the Ivy Bridge micro-architecture. It operates at 3.5 GHz and its memory organization comprises three cache levels (L1, L2 and L3) of 32 KB, 256 KB and 8192 KB, respectively. The DRAM memory controllers support up to two channels (8 B each) of DDR3 operating at \(2\times 933\) MHz.
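The DRAM figures in the second note imply a theoretical peak bandwidth, which can be worked out as channels × bytes per transfer × transfers per second. The sketch below assumes that the "2 × 933 MHz" in the note means two transfers per cycle of a 933 MHz clock; the function name and parameters are hypothetical.

```python
def peak_dram_bandwidth_gbs(channels: int, bytes_per_transfer: int,
                            clock_mhz: float, transfers_per_clock: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return channels * bytes_per_transfer * transfers_per_second / 1e9


# Values taken from the note: 2 channels, 8 B per transfer, 2 x 933 MHz.
bandwidth = peak_dram_bandwidth_gbs(channels=2, bytes_per_transfer=8,
                                    clock_mhz=933, transfers_per_clock=2)
```

Under that assumption the result is roughly 29.9 GB/s. This is an upper bound from the interface specification, not a measured sustained bandwidth.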

Acknowledgments

This work was partially supported by national funds through Fundação para a Ciência e a Tecnologia (FCT) under projects P2HCS (ref. PTDC/EEI-ELC/3152/2012), Threads (ref. PTDC/EEA-ELC/117329/2010), and project PEst-OE/EEI/LA0021/2013.

Author information

Corresponding author

Correspondence to Leonel Sousa.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Antão, D., Taniça, L., Ilic, A., Pratas, F., Tomás, P., Sousa, L. (2014). Monitoring Performance and Power for Application Characterization with the Cache-Aware Roofline Model. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_70

  • DOI: https://doi.org/10.1007/978-3-642-55224-3_70

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55223-6

  • Online ISBN: 978-3-642-55224-3

  • eBook Packages: Computer Science, Computer Science (R0)
