On Chip Network Routing for Tera-Scale Architectures

Vaidya, Aniruddha S.; Azimi, Mani; Kumar, Akhilesh

doi:10.1007/978-1-4614-8274-1_14

Abstract

Tera-scale architectures interconnect tens to several hundred general-purpose cores with each other and with other IP blocks. The high-level requirements of the underlying interconnect infrastructure include low latency, high throughput, scalable performance, flexible and adaptive routing, support for isolated partitions, fault tolerance, and support for irregular or partially enabled configurations. This chapter presents the architecture and routing algorithms that support these requirements within the overall framework of mesh- and torus-based point-to-point interconnect topologies. It first outlines the requirements and desired attributes for tera-scale interconnects, then gives an overview of the interconnect architecture and micro-architecture framework. The descriptions of the supported routing algorithms are at the heart of the chapter. They include several minimal deterministic and adaptive routing algorithms for mesh and torus networks, a novel load-balanced routing algorithm called pole-routing, and performance-isolation routing in non-rectangular mesh partitions. The implementation aspects of these topics are covered through an overview of the environment for prototyping, debugging, performance evaluation and visualization in the context of specific interconnect configurations of interest. Overall, this chapter aims to illustrate a comprehensive approach to architecting (and micro-architecting) a scalable and flexible on-die interconnect, with associated routing algorithms, that is applicable to a wide range of applications in an industry setting.

This chapter includes material adapted from our earlier publications [1–3].
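
As a point of reference for the "minimal deterministic" routing mentioned in the abstract, the sketch below shows textbook dimension-order (XY) routing on a 2D mesh, with an optional wraparound mode for a torus. It is a generic illustration only and is not taken from the chapter: the function names (`xy_route`, `_step`), the coordinate convention, and the tie-breaking rule on a torus are all assumptions made for this example.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (x, y) router coordinates; an assumed convention


def _step(cur: int, dst: int, size: int, torus: bool) -> int:
    """Return -1, 0, or +1: a minimal move along one dimension."""
    if cur == dst:
        return 0
    if not torus:
        return 1 if dst > cur else -1
    # On a torus, compare the hop count in the + direction (with wraparound)
    # against the hop count in the - direction; ties go to + (an assumption).
    forward = (dst - cur) % size
    return 1 if forward <= size - forward else -1


def xy_route(src: Node, dst: Node, width: int, height: int,
             torus: bool = False) -> List[Node]:
    """Hypothetical helper: list the routers visited from src to dst.

    The packet first corrects its X offset completely, then its Y offset,
    so the path depends only on (src, dst) and is therefore deterministic.
    """
    x, y = src
    path = [src]
    while x != dst[0]:
        x = (x + _step(x, dst[0], width, torus)) % width
        path.append((x, y))
    while y != dst[1]:
        y = (y + _step(y, dst[1], height, torus)) % height
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1), 4, 4))
    print(xy_route((0, 0), (3, 0), 4, 4, torus=True))
```

On a mesh, routing X before Y avoids cyclic channel dependencies and is deadlock-free without extra resources. On a torus the wraparound links reintroduce cycles, which is commonly handled with virtual channels [8, 9]; the sketch computes paths only and does not model that.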

References

1. M. Azimi, N. Cherukuri, D.N. Jayasimha, A. Kumar, P. Kundu, S. Park, I. Schoinas, A.S. Vaidya, Integration challenges and tradeoffs for tera-scale architectures. Intel Technol. J. 11(3), 173–184 (2007)
2. M. Azimi, D. Dai, A. Kumar, A. Mejia, D. Park, S. Saharoy, A.S. Vaidya, Flexible and adaptive on-chip interconnect for tera-scale architectures. Intel Technol. J. 13(4), 62–79 (2009)
3. M. Azimi, D. Dai, A. Kumar, A.S. Vaidya, On-chip interconnect trade-offs for tera-scale many-core processors, in Designing Network-on-Chip Architectures in the Nanoscale Era, ed. by J. Flich, D. Bertozzi (Chapman & Hall/CRC, Boca Raton, 2011)
4. R.V. Boppana, S. Chalasani, Fault-tolerant wormhole routing algorithms for mesh networks. IEEE Trans. Comput. 44(7), 848–864 (1995)
5. G.M. Chiu, The odd-even turn model for adaptive routing. IEEE Trans. Parallel Distrib. Syst. 11(7), 729–738 (2000)
6. G. Chrysos, Intel Xeon Phi coprocessor (codename Knights Corner), in Hot Chips (2012)
7. D. Dai, A.S. Vaidya, S. Saharoy, S. Park, D. Park, H.L. Thantry, R. Plate, E. Maas, A. Kumar, M. Azimi, FPGA-based prototyping of a 2D mesh/torus on-chip interconnect, in ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), 2010, p. 293. Abstract only
8. W.J. Dally, Virtual channel flow control. IEEE Trans. Parallel Distrib. Syst. 3(2), 194–205 (1992)
9. W.J. Dally, C.L. Seitz, Deadlock free message routing in multiprocessor interconnection networks. IEEE Trans. Comput. 36(5), 547–553 (1987)
10. W.J. Dally, B. Towles, Principles and Practices of Interconnection Networks (Morgan Kaufmann, San Francisco, 2004)
11. S. Dighe, S. Vangal, P. Aseron, S. Kumar, T. Jacob, K. Bowman, J. Howard, J. Tschanz, V. Erraguntla, N. Borkar, V. De, S. Borkar, Within-die variation-aware dynamic-voltage-frequency scaling core mapping and thread hopping for an 80-core processor, in IEEE International Solid-State Circuits Conference, San Francisco, 2010, pp. 174–175
12. J. Duato, A theory of fault-tolerant routing in wormhole networks, in International Conference on Parallel and Distributed Systems, Hsinchu, 1994, pp. 600–607
13. J. Duato, A necessary and sufficient condition for deadlock-free adaptive routing in wormhole networks. IEEE Trans. Parallel Distrib. Syst. 6(10), 1055–1067 (1995)
14. J. Flich, A. Mejia, P. López, J. Duato, Region-based routing: an efficient routing mechanism to tackle unreliable hardware in Network-on-Chips, in International Symposium on Networks-on-Chip (NoCS-2007), Princeton, 2007
15. J. Flich, S. Rodrigo, J. Duato, An efficient implementation of distributed routing algorithms for NoCs, in International Symposium on Network-on-Chips, Newcastle, 2008
16. M. Galles, Spider: a high-speed network interconnect. IEEE Micro 17(1), 34–39 (1997)
17. C.J. Glass, L.M. Ni, The turn model for adaptive routing. J. ACM (JACM) 41(5), 874–902 (1994)
18. Intel Xeon Phi Coprocessor 5110P, Highly parallel processing to power your breakthrough innovations. Weblink, http://www.intel.com/content/www/us/en/processors/xeon/xeonphi-detail.html
19. Y. Tamir, G.L. Frazier, Dynamically-allocated multi-queue buffers for VLSI communication switches. IEEE Trans. Comput. 41(6), 725–737 (1992)
20. A.S. Vaidya, A. Sivasubramaniam, C.R. Das, LAPSES: a recipe for high performance adaptive router design, in Proceedings of the 5th International Symposium on High Performance Computer Architecture (HPCA’99), Orlando (IEEE Computer Society, Washington, DC, 1999), pp. 236–243

Acknowledgements

Contributions and insights provided at various points in time by the following individuals are gratefully acknowledged: Donglai Dai, Dongkook Park, Andres Mejia, Gaspar Mora Porta, Roy Saharoy, Jay Jayasimha, Partha Kundu, Mani Ayyar and the late David James.

Author information

Authors and Affiliations

Nvidia Corporation, 2701 San Tomas Expy, Santa Clara, CA, 95050, USA
Aniruddha S. Vaidya
Intel Corporation, 2200 Mission College Blvd, Santa Clara, CA, 95052, USA
Mani Azimi & Akhilesh Kumar

Corresponding author

Correspondence to Aniruddha S. Vaidya.

Editor information

Editors and Affiliations

Facoltà di Ingegneria, Università degli Studi di Enna, 'Kore', Enna, Italy
Maurizio Palesi
Department of IT, University of Turku, Turku, Finland
Masoud Daneshtalab

About this chapter

Cite this chapter

Vaidya, A.S., Azimi, M., Kumar, A. (2014). On Chip Network Routing for Tera-Scale Architectures. In: Palesi, M., Daneshtalab, M. (eds) Routing Algorithms in Networks-on-Chip. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8274-1_14

DOI: https://doi.org/10.1007/978-1-4614-8274-1_14
Published: 25 September 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8273-4
Online ISBN: 978-1-4614-8274-1
eBook Packages: Engineering, Engineering (R0)
