Hardware-Aware Compilation

  • Living reference work entry
Handbook of Hardware/Software Codesign

Abstract

Hardware-aware compilers are in high demand for embedded systems with stringent, multidimensional design constraints such as cost, power, and performance. By using microarchitectural information about a processor, a hardware-aware compiler can exploit highly customized microarchitectural features and generate more efficient code than a generic compiler while still meeting the design constraints. In this chapter, we introduce two applications of hardware-aware compilers: as a production compiler, and as a tool for efficiently exploring the design space of embedded processors. We demonstrate the first application with a compiler that generates efficient code for embedded processors that have no branch predictor to reduce branch penalties. To demonstrate the second application, we show how a hardware-aware compiler can be used to explore the design space of bypass designs in a processor. In both cases, the hardware-aware compiler can generate better code than a hardware-ignorant compiler.

Author information

Corresponding author

Correspondence to Aviral Shrivastava.

Copyright information

© 2016 Springer Science+Business Media Dordrecht

About this entry

Cite this entry

Shrivastava, A., Cai, J. (2016). Hardware-Aware Compilation. In: Ha, S., Teich, J. (eds) Handbook of Hardware/Software Codesign. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7358-4_26-1

  • DOI: https://doi.org/10.1007/978-94-017-7358-4_26-1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-017-7358-4

  • Online ISBN: 978-94-017-7358-4

  • eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering
