Exploring multimodal data analysis for emotion recognition in teachers’ teaching behavior based on LSTM and MSCNN


Abstract

The range of emotions teachers exhibit during instruction strongly influences teaching effectiveness and the cognitive state of learners. An emotion recognition model can turn teaching behavior data into useful feedback for improving pedagogy. However, conventional emotion recognition models fail to capture the intricate emotional features and subtle nuances inherent in teaching behavior, which limits the accuracy of emotion classification. We therefore propose a multimodal emotion recognition model for teaching behavior built on the Long Short-Term Memory (LSTM) network and the Multi-Scale Convolutional Neural Network (MSCNN). Our approach extracts low-level features and high-level local features from the text, audio, and image modalities using LSTM and MSCNN, respectively. A transformer encoder then fuses the extracted features, which are fed into a fully connected layer for emotion recognition. Experimental results show that the proposed model attains an accuracy of 84.5% and an F1 score of 84.1% on a self-curated dataset, surpassing the comparison models and demonstrating the effectiveness of the approach.


Data availability

The datasets used in this study are available from the corresponding author upon request.


Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 6207020477).

Author information


Contributions

YL contributed to material preparation; ZC contributed to data collection and analysis and wrote the first draft; QZ contributed to the design; YZ contributed to methodology and resources; MW contributed to the study conception. All authors commented on previous versions of the manuscript, and all authors read and approved the final manuscript.

Corresponding author

Correspondence to Zengzhao Chen.

Ethics declarations

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lu, Y., Chen, Z., Zheng, Q. et al. Exploring multimodal data analysis for emotion recognition in teachers’ teaching behavior based on LSTM and MSCNN. Soft Comput (2023). https://doi.org/10.1007/s00500-023-08760-2
