A Deep Learning Framework with Cross Pooled Soft Attention for Facial Expression Recognition

  • Original Contribution
  • Published in: Journal of The Institution of Engineers (India): Series B

Abstract

Facial Expression Recognition (FER) is central to Human–Computer Interaction (HCI) and has attracted considerable attention in computer vision. We present a novel attention-based deep neural network for recognizing facial expressions from images. First, the eye-pair, mouth and full-face regions are cropped and independently passed through a pre-trained Xception network to obtain deep representations. These descriptors do not all carry the same influence when recognizing an expression; depending on the expression, some require more attention than others. We therefore incorporate an attention mechanism that automatically learns how much attention to pay to each descriptor. The attention-weighted features from the three regions are then fused using the proposed Cross Average Pooling (CAP) layers to produce a compact, discriminative representation that ultimately leads to better identification of facial expressions. The proposed cross-average-pooled soft attention yields compact and discriminative representations of facial images, allowing more accurate predictions. The approach is evaluated on two benchmark datasets (JAFFE and CK+), and the experimental results show that the proposed model outperforms existing models with accuracies of 97.67% and 97.46% on the JAFFE and CK+ datasets, respectively.
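
To make the pipeline described in the abstract concrete, the sketch below wires up three region branches (face, eye-pair, mouth), a soft attention over their descriptors, and an averaging fusion step in Keras. It is a minimal illustration under stated assumptions: a shared ImageNet-pretrained Xception backbone, a seven-class output, scalar per-region attention weights, and an element-wise average as a stand-in for CAP. The exact Cross Average Pooling and attention formulations of the paper are not reproduced here.

```python
# Minimal Keras sketch of the abstract's pipeline. The soft_attention and
# cross_average_pool helpers are illustrative assumptions, not the authors'
# exact CAP/attention layers.
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import Xception

NUM_CLASSES = 7  # assumed: seven basic expression categories

# Shared ImageNet-pretrained Xception used as the deep feature extractor for
# every cropped region (whether the paper shares or fine-tunes separate copies
# is not stated in the abstract).
backbone = Xception(weights="imagenet", include_top=False, pooling="avg")

def region_branch(name):
    """One input branch per cropped region: face, eye-pair, or mouth."""
    inp = layers.Input(shape=(299, 299, 3), name=f"{name}_input")
    return inp, backbone(inp)  # (batch, 2048) descriptor

def soft_attention(features):
    """Assumed soft attention: a learned scalar score per region descriptor,
    normalized with softmax so the region weights sum to one."""
    scores = layers.Concatenate()([layers.Dense(1)(f) for f in features])
    weights = layers.Softmax()(scores)                      # (batch, 3)
    return [f * weights[:, i:i + 1] for i, f in enumerate(features)]

def cross_average_pool(features):
    """Assumed stand-in for CAP: element-wise average of weighted descriptors."""
    return layers.Average()(features)

inputs, feats = zip(*(region_branch(r) for r in ("face", "eye_pair", "mouth")))
fused = cross_average_pool(soft_attention(list(feats)))
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(fused)

model = Model(list(inputs), outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```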

Funding

The authors received no funding for this work.

Author information

Corresponding author

Correspondence to Jyostna Devi Bodapati.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bodapati, J.D., Naik, D.S.B., Suvarna, B. et al. A Deep Learning Framework with Cross Pooled Soft Attention for Facial Expression Recognition. J. Inst. Eng. India Ser. B 103, 1395–1405 (2022). https://doi.org/10.1007/s40031-022-00746-2
