
A Real-time Multimodal Intelligent Tutoring Emotion Recognition System (MITERS)

Published in: Multimedia Tools and Applications

Abstract

Emotion recognition can be used in a wide range of applications. We are interested in E-learning systems because of the benefits of learning anywhere and anytime. Despite important advances, students’ emotions can influence the learning process, in traditional learning as well as in E-learning. Emotion can limit and block our ability to learn, think, and solve problems; conversely, it can drive us to success by boosting our innate mental ability when we love what we do and are affected by happiness and excitement. In recent years, a large number of studies have addressed emotion recognition based on different modalities, but the information provided by single-modal data such as the face is insufficient, and selecting one affective state over another can be ambiguous. To remove these ambiguities, we propose an intelligent affective tutoring system named the Multimodal Intelligent Tutoring Emotion Recognition System (MITERS), which merges three modalities simultaneously: face, text, and speech. Our system detects students’ emotions in real time and gives adequate feedback. For this purpose, we use deep learning techniques: (a) a Deep Convolutional Neural Network (DCNN) to detect emotion from the face modality, (b) a Bidirectional Long Short-Term Memory network (BiLSTM) to predict emotions from text, and (c) a Convolutional Neural Network (CNN) to detect emotions from speech. The experimental results are compared with well-known approaches, and the proposed MITERS performs well, with a classification accuracy of 97% on the multimodal MELD dataset.
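The paper does not publish its implementation, but the following minimal sketch illustrates the kind of three-branch multimodal architecture the abstract describes, using Keras. All input shapes, layer sizes, the vocabulary size, the seven-class output (assumed to match MELD's emotion labels), and the concatenation-based feature-level fusion are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (not the authors' code): a three-branch multimodal
# emotion classifier in the spirit of MITERS. Shapes and layer sizes
# are assumptions chosen for illustration only.
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 7  # assumption: the seven MELD emotion labels

# (a) DCNN branch for face images (48x48 grayscale crops assumed)
face_in = layers.Input(shape=(48, 48, 1), name="face")
x = layers.Conv2D(32, 3, activation="relu")(face_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
face_feat = layers.Dense(64, activation="relu")(x)

# (b) BiLSTM branch for text (50 token ids, 10k-word vocabulary assumed)
text_in = layers.Input(shape=(50,), name="text")
e = layers.Embedding(input_dim=10_000, output_dim=128)(text_in)
e = layers.Bidirectional(layers.LSTM(64))(e)
text_feat = layers.Dense(64, activation="relu")(e)

# (c) CNN branch for speech (100 frames x 40 MFCC coefficients assumed)
speech_in = layers.Input(shape=(100, 40, 1), name="speech")
s = layers.Conv2D(32, 3, activation="relu")(speech_in)
s = layers.MaxPooling2D()(s)
s = layers.GlobalAveragePooling2D()(s)
speech_feat = layers.Dense(64, activation="relu")(s)

# Feature-level fusion: concatenate branch embeddings, then classify.
fused = layers.Concatenate()([face_feat, text_feat, speech_feat])
out = layers.Dense(NUM_CLASSES, activation="softmax")(fused)

model = Model([face_in, text_in, speech_in], out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

In a deployed tutoring system, each branch would be fed by its own real-time preprocessing pipeline (face detection and cropping, tokenization, MFCC extraction from audio) before the fused prediction drives the feedback logic.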


Data availability

All data generated or analyzed during this study are included in this published article.


Acknowledgements

The authors extend their appreciation to the Deanship of Scientific Research at Northern Border University, Arar, KSA, for funding this research work through project number NBU-FFR-2023-0133.

Author information

Corresponding author

Correspondence to Nouha Khediri.

Ethics declarations

Competing Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Khediri, N., Ben Ammar, M. & Kherallah, M. A Real-time Multimodal Intelligent Tutoring Emotion Recognition System (MITERS). Multimed Tools Appl 83, 57759–57783 (2024). https://doi.org/10.1007/s11042-023-16424-4

