Abstract
Emotion recognition is challenging yet essential for understanding people and enhancing human-computer interaction experiences. In this paper, we explore deep belief networks (DBNs) to classify six emotional states: anger, fear, joy, neutral, sadness, and surprise, using different feature fusions. Several kinds of speech features, such as Mel-frequency cepstral coefficients (MFCC), pitch, and formants, were extracted and combined in different ways to examine the relationship between feature combinations and emotion recognition performance. We adjusted the DBN parameters to achieve the best performance for each emotion. Both gender-dependent and gender-independent experiments were conducted on the Chinese Academy of Sciences emotional speech database. The highest accuracy, 94.6%, was achieved using multi-feature fusion. The experimental results show that the DBN-based approach has good potential for practical emotion recognition, and that suitable multi-feature fusion improves speech emotion recognition performance.
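The pipeline described above (extract several speech feature groups, concatenate them into a fused vector, then classify with a DBN) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature extractors are hypothetical stand-ins generating synthetic MFCC/pitch/formant vectors, and a single RBM-pretrained layer feeding a logistic classifier (scikit-learn's `BernoulliRBM`) is used as a one-layer approximation of a DBN.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

EMOTIONS = ["anger", "fear", "joy", "neutral", "sadness", "surprise"]

rng = np.random.default_rng(0)

# Hypothetical per-utterance feature extractors; in practice these would be
# computed from the speech signal (e.g. 13 MFCCs, pitch statistics, formants).
def fake_mfcc(n):    return rng.normal(size=(n, 13))
def fake_pitch(n):   return rng.normal(size=(n, 2))
def fake_formant(n): return rng.normal(size=(n, 3))

n = 300
# Multi-feature fusion: concatenate the feature groups per utterance.
X = np.hstack([fake_mfcc(n), fake_pitch(n), fake_formant(n)])
y = rng.integers(0, len(EMOTIONS), size=n)

# RBM pretraining feeding a logistic classifier: a one-layer stand-in
# for the DBN; a real DBN would stack several RBM layers.
model = Pipeline([
    ("scale", MinMaxScaler()),  # BernoulliRBM expects inputs in [0, 1]
    ("rbm", BernoulliRBM(n_components=32, learning_rate=0.05,
                         n_iter=10, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)
print(X.shape)  # fused feature matrix: (300, 18)
```

Comparing runs of this pipeline with different subsets of the fused columns mirrors the paper's experiment relating feature combinations to recognition performance.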
Acknowledgement
This research was supported by the International S&T Cooperation Program of China (ISTCP, 2013DFA10980).
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Zhang, W., Zhao, D., Chen, X., Zhang, Y. (2016). Deep Learning Based Emotion Recognition from Chinese Speech. In: Chang, C., Chiari, L., Cao, Y., Jin, H., Mokhtari, M., Aloulou, H. (eds.) Inclusive Smart Cities and Digital Health. ICOST 2016. Lecture Notes in Computer Science, vol. 9677. Springer, Cham. https://doi.org/10.1007/978-3-319-39601-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39600-2
Online ISBN: 978-3-319-39601-9
eBook Packages: Computer Science, Computer Science (R0)