Abstract
Research on automatic speech recognition (ASR) for the Arabic language has been increasing steadily because of its potential for growth. In this paper, we implement Dynamic Time Warping (DTW) and Vector Quantization (VQ) techniques for limited-vocabulary speech recognition applications. Our goal is a small-vocabulary, speaker-independent, isolated-word recognition system with a higher success rate for recognizing the numerals 0 to 9 in spoken Palestinian Arabic. To this end, we enhanced the adopted methodology and improved its operator dependability. The algorithm obtained an accuracy rate of 99.6%. Features are extracted from the speech with the Mel Frequency Cepstral Coefficients (MFCC) algorithm, which reduces the dimensionality of the input voice signal. The algorithm is then downloaded to a Digital Signal Processor (DSP) card, which is responsible for recognizing the one-digit number and sending commands to the external world.
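To make the DTW matching step named in the abstract concrete, the following is a minimal sketch of textbook DTW used for nearest-template word recognition. It is not the authors' implementation: the function names `dtw_distance` and `recognize`, the one-dimensional toy frames standing in for MFCC vectors, and the template labels are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Each sequence is a list of equal-length tuples (one tuple per frame).
    The local cost is the Euclidean distance between frames.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Allowed steps: insertion, deletion, or diagonal match.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def recognize(features, templates):
    """Return the label of the stored template closest to `features`."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


# Hypothetical toy data: one-dimensional "frames" in place of MFCC vectors.
templates = {
    "0": [(0.0,), (1.0,), (2.0,)],
    "1": [(5.0,), (5.0,)],
}
# A slower utterance of the "0" pattern (frames repeated in time).
utterance = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
best_label = recognize(utterance, templates)
```

Because DTW lets one frame align with several frames of the other sequence, the time-stretched utterance matches its template at zero cost, which is why the technique suits isolated words spoken at different speeds.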
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig8_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig9_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig10_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig11_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig12_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-023-07959-7/MediaObjects/500_2023_7959_Fig13_HTML.png)
Data availability
Enquiries about data availability should be directed to the authors.
Funding
The authors received no specific funding for this study.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest to report regarding the present study.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Obaid, M., Hodrob, R., Abu Mwais, A. et al. Small vocabulary isolated-word automatic speech recognition for single-word commands in Arabic spoken. Soft Comput (2023). https://doi.org/10.1007/s00500-023-07959-7
Accepted:
Published:
DOI: https://doi.org/10.1007/s00500-023-07959-7