Abstract
Artificial neural networks have rapidly progressed in recent years, but are limited by the high energy costs required to train them on digital hardware. Emerging analogue hardware, such as memristor arrays, could offer improved energy efficiencies. However, the widely used backpropagation training algorithms are generally incompatible with such hardware because of mismatches between the analytically calculated training information and the imprecision of actual analogue devices. Here we report activity-difference-based training on co-designed tantalum oxide analogue memristor crossbars. Our approach, which we term memristor activity-difference energy minimization, treats the network parameters as a constrained optimization problem, and numerically calculates local gradients via Hopfield-like energy minimization using behavioural differences in the hardware targeted by the training. We use the technique to train one-layer and multilayer neural networks that can classify Braille words with high accuracy. With modelling, we show that our approach can offer over four orders of magnitude energy advantage compared with digital approaches for scaled-up problem sizes.
Similar content being viewed by others
Data availability
All the data presented in the manuscript and used to support its conclusions will be supplied by the authors upon reasonable request.
Code availability
All the simulation codes used to support the conclusions of the manuscript will be supplied by the authors upon reasonable request.
References
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 60, 84–90 (ACM, 2012).
LeCun, Y., Bengio, Y. & Hinton, G. E. Deep learning. Nature 521, 436–444 (2015).
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
LeCun, Y. et al. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1, 541–551 (1989).
Thompson, N. C., Greenewald, K., Lee, K. & Manso, G. F. The computational limits of deep learning. Preprint at https://arxiv.org/abs/2007.05558 (2020).
Mazzoni, P., Andersen, R. A. & Jordan, M. I. A more biologically plausible learning rule for neural networks. Proc. Natl Acad. Sci. USA 88, 4433–4437 (1991).
Seung, H. S. Learning in spiking neural networks by reinforcement of stochastic synaptic transmission. Neuron 40, 1063–1073 (2003).
Strubell, E., Ganesh, A. & McCallum, A. Energy and policy considerations for deep learning in NLP. in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 3645–3650 (Association for Computational Linguistics, 2019).
Bender, E. M., Gebru, T., McMillan-Major, A. & Shmitchell, S. On the dangers of stochastic parrots: can language models be too big? In ACM Conference on Fairness, Accountability, and Transparency 610–623 (ACM, 2021).
Danesh, C. D. et al. Synaptic resistors for concurrent inference and learning with high energy efficiency. Adv. Mater. 31, 1808032 (2019).
Marković, D., Mizrahi, A., Querlioz, D. & Grollier, J. Physics for neuromorphic computing. Nat. Rev. Phys. 2, 499–510 (2020).
Sokolov, A. S., Abbas, H., Abbas, Y. & Choi, C. Towards engineering in memristors for emerging memory and neuromorphic computing: a review. J. Semicond. 42, 013101 (2021).
Zhu, J., Zhang, T., Yang, Y. & Huang, R. A comprehensive review on emerging artificial neuromorphic devices. Appl. Phys. Rev. 7, 011312 (2020).
Ambrogio, S. et al. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature 558, 60–67 (2018).
Li, C. et al. Efficient and self-adaptive in-situ learning in multilayer memristor neural networks. Nat. Commun. 9, 2385 (2018).
Wang, Z. et al. In situ training of feed-forward and recurrent convolutional memristor networks. Nat. Mach. Intell. 1, 434–442 (2019).
**, Y. et al. In-memory learning with analog resistive switching memory: a review and perspective. Proc. IEEE 109, 14–42 (2020).
**a, Q. & Yang, J. J. Memristive crossbar arrays for brain-inspired computing. Nat. Mater. 18, 309–323 (2019).
Lim, D.-H. et al. Spontaneous sparse learning for PCM-based memristor neural networks. Nat. Commun. 12, 319 (2021).
Sung, C., Hwang, H. & Yoo, I. K. Perspective: a review on memristive hardware for neuromorphic computation. J. Appl. Phys. 124, 151903 (2018).
Mehonic, A. et al. Memristors—from in-memory computing, deep learning acceleration, and spiking neural networks to the future of neuromorphic and bio-inspired computing. Adv. Intell. Syst. 2, 2000085 (2020).
Cramer, B. et al. Surrogate gradients for analog neuromorphic computing. Proc. Natl Acad. Sci. USA 119, e2109194119 (2022).
Wright, L. G. et al. Deep physical neural networks trained with backpropagation. Nature 601, 549–555 (2022).
Hinton, G. E., Sejnowski, T. J. & Ackley, D. H. Boltzmann Machines: Constraint Satisfaction Networks that Learn. Report No. CMU-CS-84-119 (Department of Computer Science, Carnegie-Mellon University, 1984).
Ackley, D. H., Hinton, G. E. & Sejnowski, T. J. A learning algorithm for Boltzmann machines. Cogn. Sci. 9, 147–169 (1985).
Movellan, J. Contrastive Hebbian learning in the continuous Hopfield model. in Connectionist Models. 10–17 (Elsevier, 1991).
**e, X. & Seung, H. S. Equivalence of backpropagation and contrastive Hebbian learning in a layered network. Neural Comput. 15, 441–454 (2003).
Lee, D.-H., Zhang, S., Fischer, A. & Bengio, Y. Difference target propagation. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases 498–515 (Springer, 2015).
Spall, J. C. et al. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control 37, 332–341 (1992).
Scellier, B. & Bengio, Y. Equilibrium propagation: bridging the gap between energy-based models and backpropagation. Front. Comput. Neurosci. 11, 24 (2017).
Zoppo, G., Marrone, F. & Corinto, F. Equilibrium propagation for memristor-based recurrent neural networks. Front. Neurosci. 14, 240 (2020).
Kendall, J., Pantone, R., Manickavasagam, K., Bengio, Y. & Scellier, B. Training end-to-end analog neural networks with equilibrium propagation. Preprint at https://arxiv.org/abs/2006.01981 (2020).
Ernoult, M., Grollier, J., Querlioz, D., Bengio, Y. & Scellier, B. Updates of equilibrium prop match gradients of backprop through time in an RNN with static input. In Advances in Neural Information Processing Systems 32, 7081–7091 (Curran Associates, 2019).
Lillicrap, T. P., Santoro, A., Marris, L., Akerman, C. J. & Hinton, G. E. Backpropagation and the brain. Nat. Rev. Neurosci. 21, 335–346 (2020).
**ao, M., Meng, Q., Zhang, Z., Wang, Y. & Lin, Z. Training feedback spiking neural networks by implicit differentiation on the equilibrium state. In Advances in Neural Information Processing Systems 34, 14516–14528 (Curran Associates, 2021).
Bai, S., Koltun, V. & Kolter, J. Z. Multiscale deep equilibrium models. In Advances in Neural Information Processing Systems 33, 5238–5250 (Curran Associates, 2020).
Bai, S., Kolter, J. Z. & Koltun, V. Deep equilibrium models. In Advances in Neural Information Processing Systems 32 (Curran Associates, 2019).
O’Connor, P., Gavves, E. & Welling, M. Training a spiking neural network with equilibrium propagation. In Proc. Twenty-Second International Conference on Artificial Intelligence and Statistics 89, 1516–1523 (PMLR, 2019).
Dillavou, S., Stern, M., Liu, A. J. & Durian, D. J. Demonstration of decentralized, physics-driven learning. Phys. Rev. Appl. 18, 014040 (2022).
Stern, M., Dillavou, S., Miskin, M. Z., Durian, D. J. & Liu, A. J. Physical learning beyond the quasistatic limit. Phys. Rev. Research 4, L022037 (2022).
Hopfield, J. J. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl Acad. Sci. USA 79, 2554–2558 (1982).
Hopfield, J. J. Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Natl Acad. Sci. USA 81, 3088–3092 (1984).
Saxena, V. Mixed-signal neuromorphic computing circuits using hybrid CMOS-RRAM integration. IEEE Trans. Circuits Syst. II: Express Br 68, 581–586 (2020).
Cai, F. et al. Power-efficient combinatorial optimization using intrinsic noise in memristor Hopfield neural networks. Nat. Electron. 3, 409–418 (2020).
Kumar, S., Strachan, J. P. & Williams, R. S. Chaotic dynamics in nanoscale NbO2 Mott memristors for analogue computing. Nature 548, 318–321 (2017).
Zoppo, G., Marrone, F. & Corinto, F. Equilibrium propagation for memristor-based recurrent neural networks. Front. Neurosci. 14, 240 (2020).
Ramsauer, H. et al. Hopfield networks is all you need. in International Conference on Learning Representations (Johannes Kepler Univ. Linz, 2021).
Lillicrap, T. P., Cownden, D., Tweed, D. B. & Akerman, C. J. Random synaptic feedback weights support error backpropagation for deep learning. Nat. Commun. 7, 13276 (2016).
Neftci, E. O., Pedroni, B. U., Joshi, S., Al-Shedivat, M. & Cauwenberghs, G. Stochastic synapses enable efficient brain-inspired learning machines. Front. Neurosci. 10, 241 (2016).
Neftci, E. O., Das, S., Pedroni, B. U., Kreutz-Delgado, K. & Cauwenberghs, G. Event-driven contrastive divergence for spiking neuromorphic systems. Front. Neurosci. 7, 272 (2014).
Acknowledgements
R. Pantone and X. Sheng are gratefully acknowledged for feedback on the manuscript and/or assistance with the experiments. S.Y. and R.S.W. were partly supported by the Air Force Office of Scientific Research (AFOSR) under grant no. AFOSR-FA9550-19-0213, titled ‘Brain Inspired Networks for Multifunctional Intelligent Systems in Aerial Vehicles’. R.S.W. acknowledges the X-Grants Program of the President’s Excellence Fund at Texas A&M University. We acknowledge the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory operated for the US Department of Energy (DOE)’s National Nuclear Security Administration under contract DE-NA0003525. This paper describes objective technical results and analyses. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the US Department of Energy or the United States Government. Part of this work was performed at the Stanford Nano Shared Facilities (SNSF), supported by the National Science Foundation under award ECCS-2026822. This research used resources of the Advanced Light Source, a US DOE Office of Science User Facility under contract DE-AC02-05CH11231.
Author information
Authors and Affiliations
Contributions
All the authors contributed to the conception of the ideas, literature review, writing of the manuscript, preparation of the figures and editing.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Electronics thanks Hyung** Kim and Huaqiang Wu for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Sections 1–16 and Figs. 1–27.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yi, Si., Kendall, J.D., Williams, R.S. et al. Activity-difference training of deep neural networks using memristor crossbars. Nat Electron 6, 45–51 (2023). https://doi.org/10.1038/s41928-022-00869-w
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s41928-022-00869-w
- Springer Nature Limited
This article is cited by
-
Training an Ising machine with equilibrium propagation
Nature Communications (2024)
-
Simple Circuit Implementation of String Scaling Fractional-order Memristor with Fixed Valid Frequency Range
Nonlinear Dynamics (2024)
-
Combinational logic circuits based on a power- and area-efficient memristor with low variability
Journal of Computational Electronics (2024)