Abstract
This study aims to provide an API that converts the speech of people with dysarthria into text, built on a model that learns the speech characteristics of Korean speakers with dysarthria. The speech recognition model was constructed with KoSpeech, an open-source Korean speech recognition toolkit that implements the DeepSpeech2 model. Ten thousand voice recordings of people with dysarthria were collected together with their corresponding transcription files. Both sets of files were augmented and used for model training. Evaluation by word error rate (WER) and character error rate (CER) confirmed that the model constructed in this study recognized the speech of Korean speakers with dysarthria better than the existing speech recognition model.
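The abstract does not spell out how WER and CER were computed, so the following is only a minimal sketch of the standard definitions: the edit distance between the reference transcript and the recognized text, divided by the reference length, counted in words for WER and in characters for CER. The function names, the whitespace tokenization, and the choice to ignore spaces in CER are assumptions made for illustration, not the authors' evaluation code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Insertions, deletions, and substitutions each cost 1.
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items to match an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()  # assumption: whitespace-separated words
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    # Assumption: spaces are ignored, so each Hangul syllable counts as one unit.
    ref_chars = list(reference.replace(" ", ""))
    hyp_chars = list(hypothesis.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    reference = "오늘 날씨가 좋다"   # hypothetical ground-truth transcript
    hypothesis = "오늘 날씨 좋다"    # hypothetical recognizer output
    print(f"WER = {wer(reference, hypothesis):.3f}")
    print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this hypothetical pair, one of three reference words is wrong while only one of seven reference syllables is missing, so WER is much higher than CER. Korean attaches particles to words, so a single dropped syllable makes the whole word count as an error, which is one reason the two metrics are commonly reported together.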
Acknowledgements
This research was supported by the Hankuk University of Foreign Studies Research Fund of 2020.
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW, supervised by the IITP (Institute of Information and Communications Technology Planning and Evaluation) in 2022 (2019-0-01816).
Funding
This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2021S1A5A8065934).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Choi, G.W., Kim, M.H., Park, H.Y., Choi, Y.H., Oh, S.J., Doo, I.C. (2023). Developing a Speech Recognition Service for Korean Speakers with Dysarthria. In: Park, J.S., Yang, L.T., Pan, Y., Park, J.H. (eds) Advances in Computer Science and Ubiquitous Computing. CUTECSA 2022. Lecture Notes in Electrical Engineering, vol 1028. Springer, Singapore. https://doi.org/10.1007/978-981-99-1252-0_34
DOI: https://doi.org/10.1007/978-981-99-1252-0_34
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1251-3
Online ISBN: 978-981-99-1252-0
eBook Packages: Engineering, Engineering (R0)