Abstract
Gradient-based methods are ubiquitously used to update the internal parameters of neural networks. Problems commonly associated with gradient-based methods are their tendency to get stuck in sub-optimal local minima and their slow convergence. Efficacious solutions to these issues, such as the addition of "momentum" and adaptive learning rates, have been proposed. In this paper, we investigate the efficacy of using particle swarm optimization (PSO) to help gradient-based methods search for the optimal internal parameters to minimize the loss function of a convolutional neural network (CNN). We compare the metric performance of traditional gradient-based methods with and without the use of a PSO to either guide or refine the search for the optimal weights. The gradient-based methods we examine are stochastic gradient descent with and without a momentum term, as well as Adaptive Moment Estimation (Adam). We find that, with the exception of the Adam-optimized networks, regular gradient-based methods achieve better metric scores than when used in conjunction with a PSO. We also observe that using a PSO to refine the solution found through a gradient-descent technique reduces loss better than using a PSO to dictate the starting solution for gradient descent. Ultimately, the best solution on the MNIST dataset was achieved by the network optimized with stochastic gradient descent and momentum, with an average loss score of 0.0092 when evaluated using k-fold cross validation.
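The abstract describes two hybrid schemes: using a PSO to pick a starting point for gradient descent, and using a PSO to refine a solution already found by gradient descent. The sketch below illustrates the second scheme only in minimal form; the CNN loss is replaced by a toy quadratic objective, and the swarm size, inertia, and acceleration coefficients are illustrative assumptions, not the paper's settings.

import numpy as np

def loss(w):
    # Placeholder for the CNN loss over a flattened weight vector w (assumption).
    return np.sum((w - 3.0) ** 2)

def gradient_descent(w, lr=0.1, steps=50):
    # Plain gradient descent on the toy loss; the gradient of (w - 3)^2 is 2(w - 3).
    for _ in range(steps):
        w = w - lr * 2.0 * (w - 3.0)
    return w

def pso_refine(w_start, n_particles=20, iters=100, inertia=0.7, c1=1.5, c2=1.5):
    # Initialise the swarm around the gradient-descent solution and let a
    # standard global-best PSO search its neighbourhood for a lower-loss point.
    dim = w_start.size
    pos = w_start + 0.1 * np.random.randn(n_particles, dim)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([loss(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1 = np.random.rand(n_particles, dim)
        r2 = np.random.rand(n_particles, dim)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([loss(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest

w0 = np.random.randn(10)
w_gd = gradient_descent(w0)          # gradient-descent solution
w_refined = pso_refine(w_gd)         # PSO refinement of that solution
print(loss(w_gd), loss(w_refined))

Swapping the order of the two calls (PSO first, then gradient descent from the swarm's best position) gives the other scheme compared in the paper.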
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wessels, S., van der Haar, D. (2021). Using Particle Swarm Optimization with Gradient Descent for Parameter Learning in Convolutional Neural Networks. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_12
Print ISBN: 978-3-030-93419-4
Online ISBN: 978-3-030-93420-0