Abstract
This chapter is devoted to the development of design means for high-performance heterogeneous integrated circuits (ICs) that eliminate shortcomings in circuit operation, increase performance, and make these circuits universal.
Principles are proposed for developing design means for high-performance heterogeneous integrated circuits; they significantly improve the circuits' main technical parameters, performance, and data transmission mechanisms between components, and reduce design time.
A method is proposed to improve the means of data transmission between components in high-performance heterogeneous integrated circuits; owing to a modified architecture, it reduces the number of data bits eightfold at the cost of a 2.25% increase in the area used in the core.
A method has been developed to improve the means of data transmission between clock domains in high-performance heterogeneous integrated circuits; owing to a mixed-signal architecture, it decreases delay by at least 50% at the cost of an average 21% increase in occupied area.
A method is proposed for implementing the architecture of heterogeneous integrated circuits; by using a scheduler, a memory management unit, direct memory access, and a special command set, it provides a 32.48% increase in speed at the cost of an 11% increase in area.
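The trade-offs quoted above can be placed side by side with a little arithmetic. The sketch below is only an illustration: the dictionary keys, the function names, and the "gain per percent of area" ratio are assumptions introduced here, not metrics from the chapter. The eightfold reduction in data bits is expressed as an 87.5% reduction (1 − 1/8) so that all three gains are percentages.

```python
# Figures are taken from the abstract; all names and the ratio itself
# are illustrative assumptions, not the chapter's own figure of merit.
METHODS = {
    # eightfold fewer data bits, 2.25% more core area
    "component_link": {"gain_pct": (1 - 1 / 8) * 100, "area_pct": 2.25},
    # at least 50% less delay, 21% more area on average
    "clock_domain_crossing": {"gain_pct": 50.0, "area_pct": 21.0},
    # 32.48% higher speed, 11% more area
    "architecture": {"gain_pct": 32.48, "area_pct": 11.0},
}


def gain_per_area(gain_pct, area_pct):
    """Hypothetical ratio: percent of gain per percent of extra area."""
    return gain_pct / area_pct


def summarize(methods=METHODS):
    """Return the rounded ratio for each method, keyed by method name."""
    return {name: round(gain_per_area(**m), 2) for name, m in methods.items()}


if __name__ == "__main__":
    for name, ratio in summarize().items():
        print(f"{name}: {ratio} % gain per % area")
```

The three gains measure different quantities (bit count, delay, speed), so the ratios are not directly comparable; they only show how much each method pays in area for its stated benefit.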
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Melikyan, V. (2024). Design of High-performance Heterogeneous Integrated Circuits. In: Machine Learning-based Design and Optimization of High-Speed Circuits. Springer, Cham. https://doi.org/10.1007/978-3-031-50714-4_5
DOI: https://doi.org/10.1007/978-3-031-50714-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50713-7
Online ISBN: 978-3-031-50714-4
eBook Packages: Engineering, Engineering (R0)