Design of High-performance Heterogeneous Integrated Circuits

Machine Learning-based Design and Optimization of High-Speed Circuits

Abstract

This chapter is devoted to developing design methods for high-performance heterogeneous ICs that eliminate shortcomings in circuit operation, increase performance, and make these circuits universal.

Principles for developing design methods for high-performance heterogeneous integrated circuits are proposed. They significantly improve the circuits' main technical parameters, performance, and data transmission mechanisms between components, and they reduce design time.

A method is proposed to improve data transmission between components in high-performance heterogeneous integrated circuits. Owing to a modified architecture, it reduces the number of data bits by a factor of eight at the cost of a 2.25% increase in the area used in the core.

A method has been developed to improve data transmission between clock domains in high-performance heterogeneous integrated circuits. Owing to a mixed-signal architecture, it reduces delay by at least 50% at the cost of an average 21% increase in occupied area.

A method is proposed for implementing the architecture of heterogeneous integrated circuits. Using a scheduler, a memory management unit, direct memory access, and a special command set, it provides a 32.48% increase in speed at the cost of an 11% increase in area.
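The speed and area figures quoted above can be compared with a simple ratio. The following is a minimal sketch, not the chapter's own evaluation method: the function names are hypothetical, speed is assumed to be the inverse of delay, and "speed per unit area" is an illustrative metric chosen here rather than one defined in the chapter.

```python
def speedup_from_delay_cut(delay_cut_pct: float) -> float:
    """Convert a percentage delay reduction into a speedup factor.

    Assumption: speed is taken as the inverse of delay.
    """
    return 1.0 / (1.0 - delay_cut_pct / 100.0)


def speed_per_area(speedup: float, area_increase_pct: float) -> float:
    """Speedup factor divided by the relative area after the increase."""
    return speedup / (1.0 + area_increase_pct / 100.0)


# Figures quoted in the abstract.
# Clock-domain method: at least 50% less delay, 21% more area on average.
cdc = speed_per_area(speedup_from_delay_cut(50.0), 21.0)
# Architecture method: 32.48% more speed, 11% more area.
arch = speed_per_area(1.0 + 32.48 / 100.0, 11.0)

print(f"clock-domain transfer: {cdc:.3f}x speed per unit area")
print(f"architecture method:   {arch:.3f}x speed per unit area")
```

A ratio above 1 means the speed gain outweighs the area cost under this metric. Because the abstract gives the delay reduction as a lower bound and the area increase as an average, the clock-domain value is indicative only.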

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Melikyan, V. (2024). Design of High-performance Heterogeneous Integrated Circuits. In: Machine Learning-based Design and Optimization of High-Speed Circuits. Springer, Cham. https://doi.org/10.1007/978-3-031-50714-4_5

  • DOI: https://doi.org/10.1007/978-3-031-50714-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-50713-7

  • Online ISBN: 978-3-031-50714-4

  • eBook Packages: Engineering, Engineering (R0)
