Developing a Speech Recognition Service for Korean Speakers with Dysarthria

  • Conference paper
Advances in Computer Science and Ubiquitous Computing (CUTECSA 2022)

Abstract

The purpose of this study is to provide an API that converts the speech of people with dysarthria into text, using a model trained on the speech characteristics of Korean speakers with dysarthria. The speech recognition model was built with KoSpeech, an open-source Korean speech recognition toolkit that implements the DeepSpeech2 model. Ten thousand recordings of speech by people with dysarthria were collected together with their corresponding transcription files. Both the audio and the transcriptions were augmented and used for model training. Evaluation by word error rate (WER) and character error rate (CER) confirmed that the model built in this study recognizes the speech of Korean speakers with dysarthria better than the existing speech recognition model.
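The WER/CER evaluation mentioned above can be illustrated with a minimal sketch. The abstract does not say how the authors computed these metrics, so everything here is an assumption: the function names `edit_distance`, `wer`, and `cer` are hypothetical, WER is taken over whitespace-separated words, and CER is taken over characters with spaces removed (one common convention for Korean, where each Hangul syllable block counts as one character).

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, ignoring spaces (an assumed convention)."""
    ref_chars = list(reference.replace(" ", ""))
    hyp_chars = list(hypothesis.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # Illustrative sentence pair, not taken from the paper's data.
    reference = "오늘 날씨가 좋다"
    hypothesis = "오늘 날씨 좋다"
    print(f"WER = {wer(reference, hypothesis):.3f}")
    print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this illustrative pair, one dropped particle changes one of three words but only one of seven characters, which is why CER is often reported alongside WER for an agglutinative language such as Korean.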

References

  1. Biadsy F, Weiss RJ, Moreno PJ, Kanevsky D, Jia Y (2019) Parrotron: an end-to-end speech-to-speech conversion model and its applications to hearing-impaired speech and speech separation. arXiv

  2. Amodei D, Ananthanarayanan S, Anubhai R, Bai J, Battenberg E, Case C, Zhu Z (2016) Deep speech 2: end-to-end speech recognition in English and Mandarin. In: International conference on machine learning, PMLR, pp 173–182

  3. Min S, Lee K, Lee D, Ryu D (2020) A study on quantitative evaluation method for STT engine accuracy based on Korean characteristics. J Korea Acad-Indust Cooperation Soc 21(7):699–707

  4. Jefferson M (2019) Usability of automatic speech recognition systems for individuals with speech disorders: past, present, future, and a proposed model. Retrieved from the University of Minnesota Digital Conservancy, pp 11–13

Acknowledgements

This research was supported by Hankuk University of Foreign Studies Research Fund (of 2020).

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW, supervised by the IITP (Institute of Information and Communications Technology Planning and Evaluation) in 2022 (2019-0-01816).

Funding

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2021S1A5A8065934).

Author information

Corresponding authors

Correspondence to Se Jong Oh or Ill Chul Doo.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Choi, G.W., Kim, M.H., Park, H.Y., Choi, Y.H., Oh, S.J., Doo, I.C. (2023). Developing a Speech Recognition Service for Korean Speakers with Dysarthria. In: Park, J.S., Yang, L.T., Pan, Y., Park, J.H. (eds) Advances in Computer Science and Ubiquitous Computing. CUTECSA 2022. Lecture Notes in Electrical Engineering, vol 1028. Springer, Singapore. https://doi.org/10.1007/978-981-99-1252-0_34

  • DOI: https://doi.org/10.1007/978-981-99-1252-0_34

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-1251-3

  • Online ISBN: 978-981-99-1252-0

  • eBook Packages: Engineering, Engineering (R0)