Search Results

Showing 1-20 of 35 results
  1. Speech recognition model design for Sundanese language using WAV2VEC 2.0

    Indonesia has a variety of languages, one of which is Sundanese. Sundanese is a regional language from Indonesia that has the potential to become...

    Albert Cryssiover, Amalia Zahra in International Journal of Speech Technology
    Article 14 March 2024
  2. Wav2vec-AD: Acoustic Unit Discovery Module-Integrated, Self-Supervised Contrastive Pre-training Approach for Speech Recognition

    An effective speech recognition model necessitates an ample supply of labeled data for supervised training. However, this proposition poses a...

    Yolwas Nurmemet, Lixu Sun, ... Zhixiang Wang in Journal of Shanghai Jiaotong University (Science)
    Article 10 May 2024
  3. Improving Automatic Speech Recognition for Non-native English with Transfer Learning and Language Model Decoding

    ASR systems designed for native English (L1) usually underperform on non-native English (L2). To address this performance gap, (1) we extend our...
    Peter Sullivan, Toshiko Shibano, Muhammad Abdul-Mageed in Analysis and Application of Natural Language and Speech Processing
    Chapter 2023
  4. Self-supervised Learning for Speech Emotion Recognition Task Using Audio-visual Features and Distil Hubert Model on BAVED and RAVDESS Databases

    Existing pre-trained models like Distil HuBERT excel at uncovering hidden patterns and facilitating accurate recognition across diverse data types,...

    Karim Dabbabi, Abdelkarim Mars in Journal of Systems Science and Systems Engineering
    Article 29 May 2024
  5. Speech Emotion Recognition Using Global-Aware Cross-Modal Feature Fusion Network

    Speech emotion recognition (SER) facilitates better interpersonal communication. Emotion is normally present in conversation in many forms, such as...
    Conference paper 2023
  6. Decoding speech perception from non-invasive brain recordings

    Decoding speech from brain activity is a long-awaited goal in both healthcare and neuroscience. Invasive devices have recently led to major...

    Alexandre Défossez, Charlotte Caucheteux, ... Jean-Rémi King in Nature Machine Intelligence
    Article Open access 05 October 2023
  7. Automatic Speech Recognition of Finnish-Swedish Dialects: A Comparison of Three Cutting-Edge Technologies

    This paper explores the performance of two different automatic speech recognition models for the Finnish-Swedish language. The first model, Whisper...
    Leonardo Espinosa-Leal, Kristoffer Kuvaja Adolfsson, Andrey Shcherbakov in Smart Technologies for a Sustainable Future
    Conference paper 2024
  8. Multimodal Recommendation Engine for Advertising Using Object Detection and Natural Language Processing

    In today's world, there is an explosion in online advertising due to high levels of activity of users online. With this comes the intrinsic issue of...
    S. Rajarajeswari, Manas P. Shankar, ... Manish Manohar in Advances in Data-driven Computing and Intelligent Systems
    Conference paper 2023
  9. Quality Assurance for Speech Synthesis with ASR

    Autoregressive TTS models are still widely used. Due to their stochastic nature, the output may vary from very good to completely unusable from one...
    René Peinl, Johannes Wirth in Intelligent Systems and Applications
    Conference paper 2023
  10. End-to-end ASR framework for Indian-English accent: using speech CNN-based segmentation

    The superiority of Automatic Speech Recognition (ASR) has significantly enhanced over time, with a focus from short utterance circumstances to longer...

    Ghayas Ahmed, Aadil Ahmad Lawaye in International Journal of Speech Technology
    Article 11 November 2023
  11. Adaptive Keyword Extraction Service for Turkish

    Keyword extraction is one of the important tasks of NLP. In this study, we have implemented a fast and competitive keyword extraction model for...
    H. Yavuz Erzurumlu, Yusuf Sinan Akgul in Intelligent Systems and Applications
    Conference paper 2023
  12. Exploration of Whisper fine-tuning strategies for low-resource ASR

    Limited data availability remains a significant challenge for Whisper’s low-resource speech recognition performance, falling short of practical...

    Yunpeng Liu, Xukui Yang, Dan Qu in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 01 June 2024
  13. English Pronunciation Correction Service for Hearing-Impaired People: BETTer, Focusing on the Personalized Speech Model

    There has been a growing body of research that explores the need for hearing impaired people. However, English Education aids for the...
    Seon Hong Park, Hyun Jin Park, ... Ill Chul Doo in Advances in Computer Science and Ubiquitous Computing
    Conference paper 2023
  14. A Novel Approach to Video Summarization Using AI-GPT and Speech Recognition

    In an era where online video data is exploding, there is a growing need for efficient ways to summarize video content. In this paper, a novel...
    B. P. Aniruddha Prabhu, Tushar Sharma, ... M. S. Guru Prasad in Data Science and Applications
    Conference paper 2024
  15. Cross-Lingual Self-training to Learn Multilingual Representation for Low-Resource Speech Recognition

    Representation learning or pre-training has shown promising performance for low-resource speech recognition which suffers from the data shortage....

    Zi-Qiang Zhang, Yan Song, ... Li-Rong Dai in Circuits, Systems, and Signal Processing
    Article 23 July 2022
  16. ASR Bundestag: A Large-Scale Political Debate Dataset in German

    We present ASR Bundestag, a dataset for automatic speech recognition in German, consisting of 610 h of aligned audio-transcript pairs for supervised...
    Johannes Wirth, René Peinl in Intelligent Systems and Applications
    Conference paper 2024
  17. Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources

    Speech synthesis has made significant strides thanks to the transition from machine learning to deep learning models. Contemporary text-to-speech...

    Huda Barakat, Oytun Turk, Cenk Demiroglu in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 12 February 2024
  18. A survey of technologies for automatic Dysarthric speech recognition

    Speakers with dysarthria often struggle to accurately pronounce words and effectively communicate with others. Automatic speech recognition (ASR) is...

    Zhaopeng Qian, Kejing Xiao, Chongchong Yu in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 11 November 2023
  19. Using Pre-trained Models for Code-Switched Speech Recognition

    In various regions of the world, people tend to use a mix of multiple languages in day-to-day communication. The multilingual ASR system must...
    P. Vasuki, Ujjwaleshwar Srikanth, Vijay Sankarnarayanan in Advances in Data-Driven Computing and Intelligent Systems
    Conference paper 2024
  20. Identification of Disfluency Among Children Using Efficient Machine Learning Techniques

    Disfluency, which refers to any deviation from the anticipated fluency of spoken language, is a considerable issue affecting a substantial number of...
    Conference paper 2024