Abstract
Equipping robots with social skills that make human-robot interaction more natural, authentic, and lifelike is a challenging task in human-robot communication. A key component is the robot’s ability to perceive and understand human emotional states, and emotion detection has accordingly received considerable attention in the broader fields of human-machine interaction and affective computing. In this research, an improved facial expression recognition framework is developed for the humanoid robot Pepper that allows it to recognize human facial emotions beyond the seven basic expressions. Three additional expressions (mockery, think, and wink) are introduced alongside the seven basic ones (anger, disgust, happy, neutral, fear, sad, and surprise). Several deep learning models, MobileNetV2, Residual Attention Network, Vision Transformer (ViT), and EfficientNetV2, are evaluated on this Facial Emotion Recognition (FER) task. EfficientNetV2 proves the most robust, outperforming the other candidate models with validation accuracy, recall, and F1 score of 88.23%, 88.61%, and 88.19%, respectively.
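The abstract reports accuracy, recall, and F1 over a ten-class expression set. As a minimal, library-free sketch (not the authors' code), the snippet below shows how such macro-averaged metrics can be computed from predicted labels; the `macro_metrics` helper and the class ordering are illustrative assumptions.

```python
# Illustrative sketch: accuracy, macro-averaged recall, and macro-averaged F1
# for a multi-class FER task. The class list mirrors the ten expressions
# named in the abstract; the ordering is an assumption.
CLASSES = ["anger", "disgust", "fear", "happy", "neutral", "sad", "surprise",
           "mockery", "think", "wink"]

def macro_metrics(y_true, y_pred, classes):
    """Return (accuracy, macro recall, macro F1) for equal-length label lists."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    recalls, f1s = [], []
    for c in classes:
        # Per-class counts: true positives, false negatives, false positives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        recalls.append(recall)
        f1s.append(f1)
    # Macro averaging weights every class equally, regardless of support.
    return accuracy, sum(recalls) / len(recalls), sum(f1s) / len(f1s)
```

Note that macro averaging treats all ten classes equally, so rare expressions such as mockery or wink influence the score as much as common ones; classes absent from the evaluation set would contribute zero recall and should be excluded from `classes`.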
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ahmed, T.U., Mishra, D. (2024). FER-Pep: A Deep Learning Based Facial Emotion Recognition Framework for Humanoid Robot Pepper. In: Degen, H., Ntoa, S. (eds.) Artificial Intelligence in HCI. HCII 2024. Lecture Notes in Computer Science, vol. 14736. Springer, Cham. https://doi.org/10.1007/978-3-031-60615-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-60614-4
Online ISBN: 978-3-031-60615-1
eBook Packages: Computer Science, Computer Science (R0)