Developing a Speech Recognition Service for Korean Speakers with Dysarthria

Choi, Go Woon; Kim, Min Hyuk; Park, Ha Young; Choi, Young Hae; Oh, Se Jong; Doo, Ill Chul

doi:10.1007/978-981-99-1252-0_34

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1028)

Included in the following conference series:

International Conference on Computer Science and its Applications and the International Conference on Ubiquitous Information Technologies and Applications

Abstract

This study aims to provide an API that converts the speech of people with dysarthria into text, built on a model trained on the speech characteristics of Korean speakers with dysarthria. The speech recognition model was built with KoSpeech, an open-source Korean speech recognition toolkit that implements the DeepSpeech2 model. Ten thousand audio recordings of speech by people with dysarthria were collected together with their transcription files; both were augmented and used for model training. A WER/CER (word error rate/character error rate) evaluation confirmed that the model built in this study recognized the speech of Korean speakers with dysarthria better than the existing speech recognition model.
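The abstract does not spell out how WER and CER were computed. As a rough illustration only, the sketch below assumes the common edit-distance definitions: the number of substitutions, insertions, and deletions divided by the reference length, counted over whitespace-separated words for WER and over characters with spaces removed for CER. The function names and the sample sentences are hypothetical and are not taken from the paper or from KoSpeech.

```python
import unicodedata
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-separated tokens (assumed definition)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate with spaces removed (assumed definition).

    NFC normalization keeps each Hangul syllable as a single character.
    """
    ref_chars = list(unicodedata.normalize("NFC", reference).replace(" ", ""))
    hyp_chars = list(unicodedata.normalize("NFC", hypothesis).replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # Hypothetical transcript pair, for illustration only.
    reference = "오늘 날씨가 좋다"
    hypothesis = "오늘 날씨 좋다"
    print(f"WER = {wer(reference, hypothesis):.3f}")  # WER = 0.333
    print(f"CER = {cer(reference, hypothesis):.3f}")  # CER = 0.143
```

In this example one dropped particle counts as a full word error but only one of seven character errors. This gap is one reason CER is often reported alongside WER for Korean.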

Acknowledgements

This research was supported by the Hankuk University of Foreign Studies Research Fund of 2020.

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW, supervised by the IITP (Institute of Information & Communications Technology Planning & Evaluation) in 2022 (2019-0-01816).

Funding

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2021S1A5A8065934).

Author information

Authors and Affiliations

Department of Thai Studies, Hankuk University of Foreign Studies, Seoul, Korea
Go Woon Choi
Department of English Linguistics and Language Technology, Hankuk University of Foreign Studies, Seoul, Korea
Min Hyuk Kim
Department of Big Data Science, Korea University, Sejong, Korea
Ha Young Park
Linguistics and Cognitive Science, Hankuk University of Foreign Studies, Yongin, Korea
Young Hae Choi
Artificial Intelligence Education, Hankuk University of Foreign Studies, Yongin, Korea
Se Jong Oh & Ill Chul Doo

Corresponding authors

Correspondence to Se Jong Oh or Ill Chul Doo.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Jeonju University, Jeonju-si, Korea (Republic of)
Ji Su Park
Department of Computer Science, St. Francis Xavier University, Antigonish, NS, Canada
Laurence T. Yang
Department of Computer Science, Georgia State University, Atlanta, USA
Yi Pan
Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, Korea (Republic of)
Jong Hyuk Park

Cite this paper

Choi, G.W., Kim, M.H., Park, H.Y., Choi, Y.H., Oh, S.J., Doo, I.C. (2023). Developing a Speech Recognition Service for Korean Speakers with Dysarthria. In: Park, J.S., Yang, L.T., Pan, Y., Park, J.H. (eds) Advances in Computer Science and Ubiquitous Computing. CUTECSA 2022. Lecture Notes in Electrical Engineering, vol 1028. Springer, Singapore. https://doi.org/10.1007/978-981-99-1252-0_34

DOI: https://doi.org/10.1007/978-981-99-1252-0_34
Published: 03 June 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1251-3
Online ISBN: 978-981-99-1252-0
eBook Packages: Engineering, Engineering (R0)
