Improving Pathological Voice Detection: A Weakly Supervised Learning Method

Wei, Weixing; Wen, Liang; Qian, Jiale; Shan, Yufei; Wang, Jun; Li, Wei

doi:10.1007/978-981-19-4703-2_9

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 923)

Abstract

Deep learning methods are data-driven, but for pathological voice detection it is difficult to obtain high-quality labeled data. In this work, a weakly supervised learning method is presented to improve the quality of existing datasets by learning sample weights and fine-grained labels. First, a convolutional neural network (CNN) is devised as the basic architecture to detect pathological voice. Then, a proposed self-training algorithm is run iteratively to learn the sample weights and fine-grained labels automatically. These learned sample weights and fine-grained labels are used to train the CNN model from scratch. Experimental results on the Saarbruecken Voice Database show that diagnosis accuracy improved from 75.7% to 82.5%, a gain of 6.8 percentage points over CNN models trained on the original dataset. This work demonstrates that the weakly supervised learning method can significantly improve classification performance in distinguishing pathological from healthy voices.
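The abstract does not give the details of the self-training algorithm, so the following is only a minimal sketch of the general idea of iteratively learning sample weights, not the authors' method. It rests on several assumptions that are not in the source: a logistic regression stands in for the CNN, the data are synthetic 2-D points with 10% of labels flipped, and the weight update rule (each sample's weight is the probability the current model assigns to its given label) is a hypothetical choice. Fine-grained labels are not modeled here.

```python
import numpy as np


def predict_proba(X, theta, bias):
    """Probability of class 1 under a logistic regression model."""
    return 1.0 / (1.0 + np.exp(-(X @ theta + bias)))


def train_weighted_logreg(X, y, w, lr=0.5, epochs=300):
    """Fit logistic regression from scratch on weighted cross-entropy.

    Assumption: this simple model is a stand-in for the CNN in the paper.
    """
    theta = np.zeros(X.shape[1])
    bias = 0.0
    for _ in range(epochs):
        p = predict_proba(X, theta, bias)
        grad = w * (p - y)  # per-sample gradient, scaled by its weight
        theta -= lr * (X.T @ grad) / w.sum()
        bias -= lr * grad.sum() / w.sum()
    return theta, bias


def self_train_weights(X, y, rounds=5):
    """Alternate between retraining from scratch and re-estimating weights."""
    w = np.ones(len(y))
    for _ in range(rounds):
        theta, bias = train_weighted_logreg(X, y, w)
        p = predict_proba(X, theta, bias)
        # Hypothetical update rule: weight = probability given to the
        # provided label, so samples the model disagrees with count less.
        w = np.clip(np.where(y == 1, p, 1.0 - p), 1e-3, 1.0)
    return w, (theta, bias)


# Synthetic data (assumption): two Gaussian clusters, 10% of labels flipped.
rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal(-2.0, 1.0, (n, 2)), rng.normal(2.0, 1.0, (n, 2))])
y_clean = np.concatenate([np.zeros(n), np.ones(n)])
flip = rng.random(2 * n) < 0.1
y_noisy = np.where(flip, 1.0 - y_clean, y_clean)

weights, model = self_train_weights(X, y_noisy)
print("mean weight, flipped labels:", weights[flip].mean())
print("mean weight, clean labels:  ", weights[~flip].mean())
```

In this toy setting the samples with flipped labels end up with lower average weight than the correctly labeled ones, which is the effect a sample-weighting scheme is meant to have on a noisy dataset.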

Acknowledgements

This work was supported by the National Key R&D Program of China (2019YFC1711800) and NSFC (62171138).

Author information

Authors and Affiliations

School of Computer Science and Technology, Fudan University, Shanghai, 200438, China
Weixing Wei, Jiale Qian & Wei Li
Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, 200433, China
Wei Li
CETHIK Group Co., Ltd., Hangzhou, 311100, China
Liang Wen, Yufei Shan & Jun Wang

Corresponding author

Correspondence to Wei Li.

Editor information

Editors and Affiliations

Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China
Xi Shao
Beijing Institute of Technology, Beijing, China
Kun Qian
Communication University of China, Beijing, China
Xin Wang
Zhejiang University, Hangzhou, Zhejiang, China
Kejun Zhang

About this paper

Cite this paper

Wei, W., Wen, L., Qian, J., Shan, Y., Wang, J., Li, W. (2023). Improving Pathological Voice Detection: A Weakly Supervised Learning Method. In: Shao, X., Qian, K., Wang, X., Zhang, K. (eds) Proceedings of the 9th Conference on Sound and Music Technology. Lecture Notes in Electrical Engineering, vol 923. Springer, Singapore. https://doi.org/10.1007/978-981-19-4703-2_9

DOI: https://doi.org/10.1007/978-981-19-4703-2_9
Published: 01 September 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-4702-5
Online ISBN: 978-981-19-4703-2
eBook Packages: Engineering, Engineering (R0)
