Design of Neural Network Model for Emotional Speech Recognition

  • Conference paper
Artificial Intelligence and Evolutionary Algorithms in Engineering Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 325)

Abstract

Human–computer interaction (HCI) still needs improvement in recognition and detection tasks. Emotion recognition in particular has a major impact on applications in the social, engineering, and medical sciences. This paper presents a neural-network-based approach to recognizing emotion in speech. Linear predictive coefficients are used as features and a radial basis function network as the classifier. Speech utterances are extracted directly from the audio channel, including background noise. In total, 75 utterances from five speakers were collected across five emotion categories. Fifteen utterances were used for training and the rest for testing. The proposed approach was tested and verified on this newly developed dataset, and the results indicate that it is effective at recognizing emotions in human speech.
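The pipeline named in the abstract (linear predictive coefficients as features, a radial basis function network as classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no prediction order, framing scheme, centre selection, or kernel width, so the autocorrelation/Levinson-Durbin method, the Gaussian kernel, the least-squares output layer, and all names (`lpc`, `RBFNetwork`, `order`, `width`) are assumptions made for illustration.

```python
import numpy as np


def lpc(frame, order):
    """Linear predictive coefficients of one frame (illustrative helper).

    Uses the autocorrelation method with the Levinson-Durbin recursion.
    Returns a[0..order] with a[0] = 1, so that the prediction error is
    e[n] = sum_k a[k] * x[n - k].
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


class RBFNetwork:
    """Gaussian RBF hidden layer with a linear output layer (assumed design)."""

    def __init__(self, centers, width):
        self.centers = np.asarray(centers, dtype=float)  # hypothetical centre choice
        self.width = float(width)                        # hypothetical kernel width
        self.weights = None

    def _hidden(self, X):
        # Squared Euclidean distance from every sample to every centre.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def fit(self, X, labels, n_classes):
        H = self._hidden(np.asarray(X, dtype=float))
        T = np.eye(n_classes)[labels]  # one-hot targets
        # Output weights by linear least squares.
        self.weights, *_ = np.linalg.lstsq(H, T, rcond=None)
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return (H @ self.weights).argmax(axis=1)
```

Under these assumptions, each utterance would be reduced to a fixed-length vector (for example `lpc(frame, order)[1:]` averaged over its frames), the training vectors would be passed to `RBFNetwork.fit` with integer emotion labels, and `predict` would be applied to the held-out utterances.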

Corresponding author

Correspondence to H. K. Palo.

Copyright information

© 2015 Springer India

About this paper

Cite this paper

Palo, H.K., Mohanty, M.N., Chandra, M. (2015). Design of Neural Network Model for Emotional Speech Recognition. In: Suresh, L., Dash, S., Panigrahi, B. (eds) Artificial Intelligence and Evolutionary Algorithms in Engineering Systems. Advances in Intelligent Systems and Computing, vol 325. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2135-7_32

  • DOI: https://doi.org/10.1007/978-81-322-2135-7_32

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2134-0

  • Online ISBN: 978-81-322-2135-7

  • eBook Packages: Engineering, Engineering (R0)
