
Small vocabulary isolated-word automatic speech recognition for single-word commands in Arabic spoken

  • Focus
  • Soft Computing

Abstract

Research into automatic speech recognition (ASR) for the Arabic language has been increasing steadily because of its growth potential. In this paper, we implemented Dynamic Time Warping (DTW) and Vector Quantization (VQ) techniques for limited-vocabulary speech recognition applications. Our goal was to build a small-vocabulary, speaker-independent, isolated-word recognition system with a higher success rate for recognizing the numerals 0 to 9 in Palestinian spoken Arabic. To do so, we enhanced the adopted methodology and improved its operator dependability. The algorithm obtained an accuracy rate of 99.6%. To achieve this, we used the Mel Frequency Cepstral Coefficients (MFCC) algorithm to extract features from the speech, which reduces the dimensionality of the input voice signal. The algorithm will then be downloaded to a Digital Signal Processor (DSP) card, which will be responsible for recognizing the one-digit number and sending commands to the external world.
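The two matching techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`dtw_distance`, `vq_distortion`, `recognize`), the Euclidean frame distance, the unconstrained warping path, and the toy one-dimensional "feature vectors" are all assumptions made for illustration. In a real system each frame would be an MFCC vector computed from the recorded digit.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two sequences of feature vectors.

    Each sequence is a list of equal-length tuples, one per frame
    (a stand-in here for per-frame MFCC vectors).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def vq_distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    return sum(min(math.dist(f, c) for c in codebook) for f in frames) / len(frames)


def recognize(utterance, templates):
    """Return the label whose template has the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Hypothetical toy templates: one short feature sequence per digit label.
templates = {
    "zero": [(0.0,), (1.0,), (2.0,)],
    "one": [(5.0,), (5.0,), (4.0,)],
}
# The same pattern as "zero", spoken more slowly (frames repeated).
utterance = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
label = recognize(utterance, templates)
```

Because DTW allows frames to repeat along the alignment path, the slower utterance still matches the "zero" template exactly. A VQ-based recognizer would instead keep one codebook per digit and choose the codebook with the lowest `vq_distortion`.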


Data availability

Enquiries about data availability should be directed to the authors.

Funding

The authors received no specific funding for this study.

Author information

Corresponding author

Correspondence to Mahmoud Obaid.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest to report regarding the present study.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Obaid, M., Hodrob, R., Abu Mwais, A. et al. Small vocabulary isolated-word automatic speech recognition for single-word commands in Arabic spoken. Soft Comput (2023). https://doi.org/10.1007/s00500-023-07959-7


  • DOI: https://doi.org/10.1007/s00500-023-07959-7
