Abstract
Artificial intelligence (AI) algorithms are computationally intensive and operate on voluminous data, so dedicated AI accelerators are needed to meet their hardware demands. This chapter introduces the core concepts of AI algorithms from a hardware point of view and derives their hardware requirements. It then examines various processor design approaches and discusses their pros and cons, bridging the gap between AI experts and hardware designers. It also introduces several emerging approaches for elevating the overall performance of AI accelerators. Finally, the chapter summarizes the organization of the book around its themes: platform-based, technology-based, and emerging-application-based AI accelerators, and their current issues.
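To make the computational intensity mentioned above concrete, the sketch below estimates the multiply-accumulate (MAC) count of a single convolutional layer, the operation that dominates most deep-learning workloads and that AI accelerators are built to speed up. The layer shape used here is an illustrative assumption (roughly a mid-network VGG-style layer), not a figure taken from the chapter.

```python
def conv_macs(out_h, out_w, out_ch, in_ch, k_h, k_w):
    """MACs for one convolutional layer: each of the
    out_h * out_w * out_ch output elements requires
    in_ch * k_h * k_w multiply-accumulate operations."""
    return out_h * out_w * out_ch * in_ch * k_h * k_w

# Illustrative layer: 56x56 output, 256 output channels,
# 3x3 kernels over 256 input channels.
macs = conv_macs(56, 56, 256, 256, 3, 3)
print(f"{macs:,} MACs for a single layer")  # ~1.85 billion
```

A full network stacks dozens of such layers and runs them per input frame, which is why general-purpose processors struggle and specialized accelerators (GPUs, TPUs, ASICs, and the other architectures surveyed in this chapter) are attractive.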
Acknowledgments
This work was supported by the Brain Pool Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (NRF-2019H1D3A1A01071115).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Mishra, A., Yadav, P., Kim, S. (2023). Artificial Intelligence Accelerators. In: Mishra, A., Cha, J., Park, H., Kim, S. (eds) Artificial Intelligence and Hardware Accelerators. Springer, Cham. https://doi.org/10.1007/978-3-031-22170-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22169-9
Online ISBN: 978-3-031-22170-5