Abstract
Gradient-based methods are ubiquitously used to update the internal parameters of neural networks. Problems commonly associated with gradient-based methods are their tendency to get stuck in sub-optimal local minima and their slow convergence. Efficacious solutions to these issues, such as the addition of "momentum" and adaptive learning rates, have been proposed. In this paper, we investigate the efficacy of using particle swarm optimization (PSO) to help gradient-based methods search for the optimal internal parameters to minimize the loss function of a convolutional neural network (CNN). We compare the metric performance of traditional gradient-based methods with and without the use of a PSO to either guide or refine the search for the optimal weights. The gradient-based methods we examine are stochastic gradient descent with and without a momentum term, as well as Adaptive Moment Estimation (Adam). We find that, with the exception of the Adam-optimized networks, regular gradient-based methods achieve better metric scores than when used in conjunction with a PSO. We also observe that using a PSO to refine the solution found through a gradient-descent technique reduces loss better than using a PSO to dictate the starting solution for gradient descent. Ultimately, the best solution on the MNIST dataset was achieved by the network optimized with stochastic gradient descent and momentum, with an average loss score of 0.0092 when evaluated using k-fold cross validation.
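The abstract describes two hybrid schemes: using a PSO to pick a starting point for gradient descent, and using a PSO to refine a solution already found by gradient descent. The sketch below illustrates the second scheme only in minimal form; the CNN loss is replaced by a toy quadratic objective, and the swarm size, inertia, and acceleration coefficients are illustrative assumptions, not the paper's settings.

import numpy as np

def loss(w):
    # Placeholder for the CNN loss over a flattened weight vector w (assumption).
    return np.sum((w - 3.0) ** 2)

def gradient_descent(w, lr=0.1, steps=50):
    # Plain gradient descent on the toy loss; the gradient of (w - 3)^2 is 2(w - 3).
    for _ in range(steps):
        w = w - lr * 2.0 * (w - 3.0)
    return w

def pso_refine(w_start, n_particles=20, iters=100, inertia=0.7, c1=1.5, c2=1.5):
    # Initialise the swarm around the gradient-descent solution and let a
    # standard global-best PSO search its neighbourhood for a lower-loss point.
    dim = w_start.size
    pos = w_start + 0.1 * np.random.randn(n_particles, dim)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([loss(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1 = np.random.rand(n_particles, dim)
        r2 = np.random.rand(n_particles, dim)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([loss(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest

w0 = np.random.randn(10)
w_gd = gradient_descent(w0)          # gradient-descent solution
w_refined = pso_refine(w_gd)         # PSO refinement of that solution
print(loss(w_gd), loss(w_refined))

Swapping the order of the two calls (PSO first, then gradient descent from the swarm's best position) gives the other scheme compared in the paper.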
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wessels, S., van der Haar, D. (2021). Using Particle Swarm Optimization with Gradient Descent for Parameter Learning in Convolutional Neural Networks. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_12
Print ISBN: 978-3-030-93419-4
Online ISBN: 978-3-030-93420-0