Improvements in ASR system to access the real-time agricultural commodity prices and weather information in Kannada language/dialects

Multimedia Tools and Applications

Abstract

This work presents improvements to an isolated Kannada automatic speech recognition (ASR) system for accessing agricultural commodity prices and weather information in uncontrolled environments. The system comprises an interactive voice response system (IVRS), ASR models, and databases of weather and agricultural commodity price information. The previous system suffered from reduced accuracy because of the various background noises present during both offline and online speech recognition. To address this, the proposed system adds a background noise reduction module ahead of the speech feature extraction stage. The results indicate that the proposed noise reduction algorithm outperforms traditional signal processing algorithms, leaving no audible musical noise or other background noise in the enhanced NOIZEUS speech corpora and isolated Kannada speech data. Combining this noise suppression algorithm with time delay neural network (TDNN) based ASR modeling yields a 1.1% improvement in speech recognition accuracy over the previous deep neural network-hidden Markov model (DNN-HMM) based system. The enhanced isolated Kannada system was tested online by 500 speakers/users accessing real-time agricultural commodity prices and weather information in Kannada language/dialects in noise-corrupted environments. The algorithms' source code and the ASR models are publicly available.
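To make the ordering described in the abstract concrete (noise reduction first, feature extraction second), here is a minimal sketch in Python with NumPy. It is not the authors' algorithm, which the abstract does not specify. It uses basic magnitude spectral subtraction, one of the traditional signal processing approaches, as a stand-in. The function names, the frame length of 256 samples, the spectral floor of 0.05, and the assumption that the first few frames of a recording contain only noise are all illustrative choices, and the log power spectrum is a placeholder for whatever features the real front end computes.

```python
import numpy as np


def _frames(x, frame_len, hop):
    """Slice x into overlapping frames of frame_len samples, hop samples apart."""
    n = 1 + (len(x) - frame_len) // hop
    idx = hop * np.arange(n)[:, None] + np.arange(frame_len)[None, :]
    return x[idx]


def spectral_subtraction(noisy, noise_frames=8, frame_len=256, floor=0.05):
    """Stand-in noise reduction: basic magnitude spectral subtraction.

    Assumption: the first `noise_frames` frames contain noise only.
    """
    hop = frame_len // 2
    window = np.hanning(frame_len + 1)[:-1]  # periodic Hann window
    # Pad by one hop on each side so every sample is covered by two frames.
    n_frames = int(np.ceil(len(noisy) / hop)) + 1
    padded = np.zeros((n_frames + 1) * hop)
    padded[hop:hop + len(noisy)] = noisy

    spec = np.fft.rfft(_frames(padded, frame_len, hop) * window, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Average noise magnitude per frequency bin, taken from the lead-in frames.
    noise_mag = mag[:noise_frames].mean(axis=0)
    # Subtract the estimate; the floor keeps magnitudes from going negative.
    clean_mag = np.maximum(mag - noise_mag, floor * noise_mag)

    # Resynthesize with the noisy phase, then weighted overlap-add.
    frames = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros_like(padded)
    norm = np.zeros_like(padded)
    for i, frame in enumerate(frames):
        sl = slice(i * hop, i * hop + frame_len)
        out[sl] += frame * window
        norm[sl] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[hop:hop + len(noisy)]


def log_spectrum_features(signal, frame_len=256):
    """Placeholder feature extraction: log power spectrum of each frame."""
    hop = frame_len // 2
    window = np.hanning(frame_len + 1)[:-1]
    power = np.abs(np.fft.rfft(_frames(signal, frame_len, hop) * window, axis=1)) ** 2
    return np.log(power + 1e-10)


def front_end(noisy):
    """Noise reduction runs before feature extraction, as in the abstract."""
    return log_spectrum_features(spectral_subtraction(noisy))
```

Plain spectral subtraction of this kind is known to leave residual "musical" noise, which is the artifact the abstract says the proposed algorithm avoids, so the sketch illustrates the pipeline position of the module rather than its quality.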



Author information

Corresponding author

Correspondence to Thimmaraja Yadava G.

Ethics declarations

Conflict of Interests

The authors declare no conflict of interest regarding this manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The source code and ASR models are available at https://sites.google.com/view/thimmarajayadavag/downloads

Nagaraja B G and Jayanna H S contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

G, T.Y., G, N.B. & S, J.H. Improvements in ASR system to access the real-time agricultural commodity prices and weather information in Kannada language/dialects. Multimed Tools Appl 83, 4195–4217 (2024). https://doi.org/10.1007/s11042-023-15350-9

  • DOI: https://doi.org/10.1007/s11042-023-15350-9
