
Facial expression recognition based on local–global information reasoning and spatial distribution of landmark features

  • Original article
  • The Visual Computer (2024)

Abstract

In the field of facial expression recognition (FER), two main research trends exist: data-driven FER and feature-driven FER. The former focuses on data problems (e.g., sample imbalance and multimodal fusion), while the latter explores facial expression features themselves. Since feature-driven FER is the more important of the two, and to mine facial features more deeply, we propose an expression recognition model based on Local–Global information Reasoning and Landmark Spatial Distributions. In particular, to reason about local–global information, multiple attention mechanisms are combined with a modified residual module to form the Res18-LG module. In addition, to account for the spatial topology of facial landmarks, a topological relationship graph of landmarks and a two-layer graph neural network are introduced to extract spatial distribution features. Finally, experimental results on the FERPlus and RAF-DB datasets demonstrate that our model outperforms state-of-the-art methods.
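The full implementation is not available in this preview, but the two components named above lend themselves to a short illustration. The following is a minimal, hypothetical PyTorch sketch, not the authors' code: the class names, the attention design (squeeze-style channel attention plus a 1×1-conv spatial map), the feature dimensions, and the chain adjacency in the usage example are all assumptions. Only the general shape, an attention-augmented residual block and a two-layer graph convolution over landmark coordinates, follows what the abstract describes.

```python
# Hypothetical sketch -- NOT the paper's released code. It illustrates:
# (1) a ResNet-18-style residual block augmented with channel and spatial
#     attention for local-global reasoning, and
# (2) a two-layer graph convolution over a facial-landmark adjacency matrix.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentiveResidualBlock(nn.Module):
    """Residual block with channel and spatial attention (assumed design,
    not the exact Res18-LG module from the paper)."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        # Channel attention: global average pool -> bottleneck MLP -> sigmoid.
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        # Spatial attention: 1x1 conv over the feature map -> sigmoid.
        self.spatial_conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        b, c, _, _ = out.shape
        w_ch = self.channel_fc(out.mean(dim=(2, 3))).view(b, c, 1, 1)
        w_sp = torch.sigmoid(self.spatial_conv(out))
        out = out * w_ch * w_sp          # reweight channels, then locations
        return F.relu(out + x)           # residual connection


class TwoLayerLandmarkGCN(nn.Module):
    """Two graph-convolution layers over facial landmarks, using the
    standard normalized-adjacency GCN formulation (Kipf & Welling)."""

    def __init__(self, in_dim: int = 2, hidden: int = 32, out_dim: int = 64):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden)
        self.w2 = nn.Linear(hidden, out_dim)

    @staticmethod
    def normalize(adj: torch.Tensor) -> torch.Tensor:
        # A_hat = D^{-1/2} (A + I) D^{-1/2}
        a = adj + torch.eye(adj.size(0))
        d = a.sum(dim=1).rsqrt()
        return d.unsqueeze(1) * a * d.unsqueeze(0)

    def forward(self, coords: torch.Tensor, adj: torch.Tensor):
        # coords: (num_landmarks, in_dim), e.g. 68 (x, y) landmark positions.
        a_hat = self.normalize(adj)
        h = F.relu(self.w1(a_hat @ coords))
        h = self.w2(a_hat @ h)
        return h.mean(dim=0)             # pooled spatial-distribution feature


# Toy usage: 68 random landmarks connected in a chain. A real landmark graph
# would encode face topology (jawline, brows, eyes, nose, mouth connectivity).
adj = torch.zeros(68, 68)
idx = torch.arange(67)
adj[idx, idx + 1] = adj[idx + 1, idx] = 1.0
feat = TwoLayerLandmarkGCN()(torch.rand(68, 2), adj)
print(feat.shape)  # torch.Size([64])
```

In a complete pipeline one would typically fuse the pooled landmark feature with the CNN's global feature before classification; the abstract does not specify the fusion step, so it is omitted here.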


Data availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.


Acknowledgements

This work was supported by the Sichuan Science and Technology Program under Grant 2023YFS0195.

Author information


Corresponding author

Correspondence to Linbo Qing.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Xiong, K., Qing, L., Li, L. et al.: Facial expression recognition based on local–global information reasoning and spatial distribution of landmark features. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03345-y

