Abstract
To address the problem that traditional convolutional neural networks cannot precisely classify facial expression image features, this paper proposes an interpretable facial expression recognition method that combines a ResNet18 residual network with a support vector machine (SVM). The SVM classifier strengthens the matching between feature vectors and labels in the expression feature space, improving the recognition performance of the whole model. Class activation mapping and t-distributed stochastic neighbor embedding (t-SNE) are used to visualize and interpret the feature analysis and decision making of the residual network. The experimental results and the interpretability analysis show that the proposed structure effectively improves the recognition ability of the network: on the FER2013, JAFFE, and CK+ datasets it achieves 67.65%, 84.44%, and 96.94% emotion recognition accuracy, respectively, demonstrating good generalization ability and competitive performance.
Data availability
All data included in this study are available from the corresponding author upon request.
Funding
This research was supported by the Shanghai "Science and Technology Innovation Action Plan" Artificial Intelligence Science and Technology Support Special Project (20511101600).
Author information
Contributions
JL and WS conceived the study and contributed significantly to the analysis and manuscript preparation; GX provided early-stage theoretical support for this research.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Ji, L., Wu, S. & Gu, X. A facial expression recognition algorithm incorporating SVM and explainable residual neural network. SIViP 17, 4245–4254 (2023). https://doi.org/10.1007/s11760-023-02657-1