A Deep Learning Framework with Cross Pooled Soft Attention for Facial Expression Recognition

Bodapati, Jyostna Devi; Naik, D S Bhupal; Suvarna, B; Naralasetti, Veeranjaneyulu

doi:10.1007/s40031-022-00746-2

Original Contribution
Published: 04 May 2022

Volume 103, pages 1395–1405, (2022)

Journal of The Institution of Engineers (India): Series B

Jyostna Devi Bodapati¹ (ORCID: orcid.org/0000-0002-5185-882X),
D S Bhupal Naik¹,
B Suvarna¹ &
Veeranjaneyulu Naralasetti²

355 Accesses
10 Citations

Abstract

Facial Expression Recognition (FER) is central to Human–Computer Interaction (HCI) and has received considerable attention in computer vision. We present a novel attention-based deep neural network for recognizing facial expressions from images. First, the eye-pair, mouth and face regions are cropped and passed independently through a pre-trained Xception network to obtain deep representations. These descriptors may not all have the same influence on recognition, and some may deserve more attention than others depending on the type of expression. We therefore incorporate an attention mechanism into the model to learn automatically how much attention to pay to each descriptor. The attention-based features obtained from the three regions are then fused using the proposed Cross Average Pooling (CAP) layers to produce a compact and discriminative representation of the facial image, which leads to more accurate identification of facial expressions. The proposed approach is evaluated on two benchmark datasets (JAFFE and CK+), and the experimental results show that it outperforms existing models, with accuracies of 97.67% and 97.46% on the JAFFE and CK+ datasets, respectively.
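The fusion idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation of the attention scores or of the CAP layers, so everything below is an assumption made for illustration: the attention score is taken to be a dot product with a hypothetical learned query vector, and "cross average pooling" is read as an element-wise average across the three attention-weighted region descriptors. The short 4-dimensional vectors stand in for the much longer Xception descriptors, and the names `soft_attention_weights`, `cross_average_pool` and `query` are not from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention_weights(descriptors, query):
    """Score each region descriptor against a query vector, then normalize.

    Assumption: the score is a plain dot product with a learned query;
    the paper may use a different scoring function.
    """
    scores = [sum(d_j * q_j for d_j, q_j in zip(d, query)) for d in descriptors]
    return softmax(scores)


def cross_average_pool(descriptors, weights):
    """Scale each descriptor by its weight, then average across regions.

    Assumption: CAP is interpreted here as an element-wise mean over the
    attention-weighted descriptors, giving one vector of the same length.
    """
    n = len(descriptors)
    dim = len(descriptors[0])
    return [
        sum(w * d[j] for w, d in zip(weights, descriptors)) / n
        for j in range(dim)
    ]


# Toy stand-ins for the deep descriptors of the three cropped regions.
eye_pair = [1.0, 0.0, 2.0, 1.0]
mouth = [0.0, 1.0, 1.0, 3.0]
face = [2.0, 2.0, 0.0, 1.0]
query = [0.5, -0.5, 1.0, 0.0]  # hypothetical learned parameter

regions = [eye_pair, mouth, face]
weights = soft_attention_weights(regions, query)
fused = cross_average_pool(regions, weights)
print(weights, fused)
```

In this reading the fused vector has the same length as a single region descriptor rather than three times that length, which is one way the representation could end up compact; in a trained network the query would be learned jointly with the classifier.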

Funding

No Funding

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Vignan’s Foundation for Science Technology and Research, Vadlamudi, India
Jyostna Devi Bodapati, D S Bhupal Naik & B Suvarna
Department of Information Technology, Vignan’s Foundation for Science Technology and Research, Vadlamudi, India
Veeranjaneyulu Naralasetti

Corresponding author

Correspondence to Jyostna Devi Bodapati.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bodapati, J.D., Naik, D.S.B., Suvarna, B. et al. A Deep Learning Framework with Cross Pooled Soft Attention for Facial Expression Recognition. J. Inst. Eng. India Ser. B 103, 1395–1405 (2022). https://doi.org/10.1007/s40031-022-00746-2

Received: 07 April 2021
Accepted: 09 April 2022
Published: 04 May 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s40031-022-00746-2
