Abstract
Equipping robots with social skills that make human-robot interaction more natural, authentic, and lifelike is a challenging task in human-robot communication. A key component is the robot’s ability to perceive and understand human emotional states, and emotion detection has accordingly received considerable attention in the broader fields of human-machine interaction and affective computing. In this research, an improved facial expression recognition framework is developed for the humanoid robot Pepper that allows it to recognize human facial emotions beyond the seven basic expressions. Three additional expressions (mockery, think, and wink) are introduced alongside the seven basic ones (anger, disgust, happy, neutral, fear, sad, and surprise). Several deep learning models, MobileNetV2, Residual Attention Network, Vision Transformer (ViT), and EfficientNetV2, are evaluated on this Facial Emotion Recognition (FER) task. EfficientNetV2 proves the most robust, outperforming the other candidate models with validation accuracy, recall, and F1 score of 88.23%, 88.61%, and 88.19%, respectively.
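The abstract reports accuracy, recall, and F1 over a ten-class expression set. As a minimal, library-free sketch (not the authors' code), the snippet below shows how such macro-averaged metrics can be computed from predicted labels; the `macro_metrics` helper and the class ordering are illustrative assumptions.

```python
# Illustrative sketch: accuracy, macro-averaged recall, and macro-averaged F1
# for a multi-class FER task. The class list mirrors the ten expressions
# named in the abstract; the ordering is an assumption.
CLASSES = ["anger", "disgust", "fear", "happy", "neutral", "sad", "surprise",
           "mockery", "think", "wink"]

def macro_metrics(y_true, y_pred, classes):
    """Return (accuracy, macro recall, macro F1) for equal-length label lists."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    recalls, f1s = [], []
    for c in classes:
        # Per-class counts: true positives, false negatives, false positives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        recalls.append(recall)
        f1s.append(f1)
    # Macro averaging weights every class equally, regardless of support.
    return accuracy, sum(recalls) / len(recalls), sum(f1s) / len(f1s)
```

Note that macro averaging treats all ten classes equally, so rare expressions such as mockery or wink influence the score as much as common ones; classes absent from the evaluation set would contribute zero recall and should be excluded from `classes`.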
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ahmed, T.U., Mishra, D. (2024). FER-Pep: A Deep Learning Based Facial Emotion Recognition Framework for Humanoid Robot Pepper. In: Degen, H., Ntoa, S. (eds.) Artificial Intelligence in HCI. HCII 2024. Lecture Notes in Computer Science, vol. 14736. Springer, Cham. https://doi.org/10.1007/978-3-031-60615-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-60614-4
Online ISBN: 978-3-031-60615-1
eBook Packages: Computer Science, Computer Science (R0)