FER-Pep: A Deep Learning Based Facial Emotion Recognition Framework for Humanoid Robot Pepper

  • Conference paper
Artificial Intelligence in HCI (HCII 2024)

Abstract

Equipping robots with social skills that make human-robot interaction more natural, authentic, and lifelike is a challenging task in the domain of human-robot communication. A key component is the robot’s ability to perceive and understand human emotional states. Emotion detection has received considerable attention in the broader fields of human-machine interaction and affective computing. In this research, an improved facial expression recognition framework is developed for the humanoid robot Pepper that allows it to recognize human facial emotions beyond the seven basic expressions. Three additional facial expressions, mockery, think, and wink, are introduced alongside the seven basic expressions: anger, disgust, happy, neutral, fear, sad, and surprise. Several deep learning models, MobileNetV2, Residual Attention Network, Vision Transformer (ViT), and EfficientNetV2, are evaluated on this Facial Emotion Recognition (FER) task. EfficientNetV2 proves to be the most robust, outperforming the other candidate models with a validation accuracy, recall, and F1 score of 88.23%, 88.61%, and 88.19%, respectively.
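The paper does not specify how its metrics are averaged; as an illustration only, the sketch below (plain Python, with hypothetical label lists not taken from the paper) computes accuracy, macro-averaged recall, and macro-averaged F1 for the ten-class FER label set described in the abstract:

```python
# Ten FER classes: seven basic expressions plus the three introduced ones.
LABELS = ["anger", "disgust", "happy", "neutral", "fear",
          "sad", "surprise", "mockery", "think", "wink"]

def fer_metrics(y_true, y_pred, labels=LABELS):
    """Return (accuracy, macro recall, macro F1) over the given classes.

    Classes absent from y_true contribute 0 to the macro averages,
    mirroring the common zero-division convention.
    """
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    recalls, f1s = [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        recalls.append(recall)
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return accuracy, sum(recalls) / len(recalls), sum(f1s) / len(f1s)
```

For example, `fer_metrics(["happy", "happy", "sad", "wink"], ["happy", "sad", "sad", "wink"])` yields an accuracy of 0.75, with the macro averages diluted by the seven classes absent from this toy sample.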



Author information

Correspondence to Deepti Mishra.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ahmed, T.U., Mishra, D. (2024). FER-Pep: A Deep Learning Based Facial Emotion Recognition Framework for Humanoid Robot Pepper. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2024. Lecture Notes in Computer Science, vol 14736. Springer, Cham. https://doi.org/10.1007/978-3-031-60615-1_13

  • DOI: https://doi.org/10.1007/978-3-031-60615-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-60614-4

  • Online ISBN: 978-3-031-60615-1

  • eBook Packages: Computer Science; Computer Science (R0)
