Search Results

Showing 1-20 of 10,000 results
  1. Speech-to-SQL: toward speech-driven SQL query generation from natural language question

    Speech-based inputs have been gaining significant momentum with the popularity of smartphones and tablets in our daily lives, since voice is the most...

    Yuanfeng Song, Raymond Chi-Wing Wong, Xuefang Zhao in The VLDB Journal
    Article 16 February 2024
  2. Diversity subspace generation based on feature selection for speech emotion recognition

    Automatic emotion recognition from speech signals is an important research area. Many speech emotion recognition (SER) methods have been proposed,...

    Qing Ye, Yaxin Sun in Multimedia Tools and Applications
    Article 17 August 2023
  3. Fusion-s2igan: an efficient and effective single-stage framework for speech-to-image generation

    The goal of a speech-to-image transform is to produce a photo-realistic picture directly from a speech signal. Current approaches are based on a...

    Zhenxing Zhang, Lambert Schomaker in Neural Computing and Applications
    Article Open access 19 March 2024
  4. Co-speech Gesture Generation with Variational Auto Encoder

    The research field of generating natural gestures from speech input is called co-speech gesture generation. Co-speech generation methods should...
    Shinichi Ka, Koichi Shinoda in MultiMedia Modeling
    Conference paper 2024
  5. Shallow Diffusion Motion Model for Talking Face Generation from Speech

    Talking face generation is synthesizing a lip synchronized talking face video by inputting an arbitrary face image and audio clips. People naturally...
    Xulong Zhang, Jianzong Wang, ... Jing Xiao in Web and Big Data
    Conference paper 2023
  6. Gammatonegram representation for end-to-end dysarthric speech processing tasks: speech recognition, speaker identification, and intelligibility assessment

    Dysarthria is a disability that causes a disturbance in the human speech system and reduces the quality and intelligibility of a person’s speech....

    Aref Farhadipour, Hadi Veisi in Iran Journal of Computer Science
    Article 10 March 2024
  7. Improvement of automatic speech recognition systems utilizing 2D adaptive wavelet transformation applied to recurrence plot of speech trajectories

    Spectral-based features, typically used in ASR systems, do not capture the phase information of speech signals. Thus, exploiting new features that do...

    Shabnam Firooz, Farshad Almasganj, Yasser Shekofteh in Signal, Image and Video Processing
    Article 15 December 2023
  8. CommanderUAP: a practical and transferable universal adversarial attacks on speech recognition models

    Most of the adversarial attacks against speech recognition systems focus on specific adversarial perturbations, which are generated by adversaries...

    Zheng Sun, Jinxiao Zhao, ... Lei Ju in Cybersecurity
    Article Open access 05 June 2024
  9. Speech waveform reconstruction from speech parameters for an effective text to speech synthesis system using minimum phase harmonic sinusoidal model for Punjabi

    Speech processing plays a vital role in current speech communication applications. The major objective of digital speech is transmission of messages...

    Navdeep Kaur, Parminder Singh in Multimedia Tools and Applications
    Article 25 March 2022
  10. Speech Enhancement with Generative Diffusion Models

    An alternative approach to speech denoising using generative diffusion models that model the distribution of training data is proposed. In...

    O. V. Girfanov, A. G. Shishkin in Automatic Documentation and Mathematical Linguistics
    Article 01 October 2023
  11. A multitask co-training framework for improving speech translation by leveraging speech recognition and machine translation tasks

    End-to-end speech translation (ST) has attracted substantial attention due to its less error accumulation and lower latency. Based on triplet ST data ...

    Yue Zhou, Yuxuan Yuan, Xiaodong Shi in Neural Computing and Applications
    Article 27 February 2024
  12. Design and Implementation of Speech Generation and Demonstration Research Based on Deep Learning

    Aiming at complex and changeable factors such as speech theme and environment, which make it difficult for a speaker to prepare the speech text in a...
    Wanyu Luo, Yanqing Wang, ... Yiqin Xu in Data Science
    Conference paper 2023
  13. Audio-guided self-supervised learning for disentangled visual speech representations

    In this paper, we propose a novel two-branch framework to learn the disentangled visual speech representations based on two particular observations....

    Dalu Feng, Shuang Yang, ... Xilin Chen in Frontiers of Computer Science
    Article 25 June 2024
  14. Deep Learning-Based Acoustic Feature Representations for Dysarthric Speech Recognition

    Dysarthria is a motor speech disorder and the most common neurodegenerative disease characterized by low volume in precise articulation, poor...

    M. Latha, M. Shivakumar, ... M. Keerthi Kumar in SN Computer Science
    Article 20 March 2023
  15. Detecting Speech Disorders Using A Machine-Learning Guided Method in Spontaneous Tunisian Dialect Speech

    This work investigates the disfluencies processing task within the natural spoken language comprehension field. We present a transcription-based...

    Emna Boughariou, Younès Bahou, Lamia Hadrich Belguith in SN Computer Science
    Article 17 April 2024
  16. Speech based emotion recognition by using a faster region-based convolutional neural network

    Automatic emotion identification from speech is a difficult problem that significantly depends on the accuracy of the speech characteristics employed...

    Chappidi Suneetha, Raju Anitha in Multimedia Tools and Applications
    Article 02 April 2024
  17. WaveVC: Speech and Fundamental Frequency Consistent Raw Audio Voice Conversion

    Voice conversion (VC) is a task for changing the speech of a source speaker to the target voice while preserving linguistic information of the source...

    Kyungdeuk Ko, Donghyeon Kim, ... Hanseok Ko in Neural Processing Letters
    Article Open access 08 May 2024
  18. A novel conversational hierarchical attention network for speech emotion recognition in dyadic conversation

    Speech is one of the most fundamental mediums for human-to-human interaction, thereby playing a pivotal role in shaping the landscape of...

    Mohammed Tellai, Lijian Gao, ... Mounir Abdelaziz in Multimedia Tools and Applications
    Article 29 December 2023
  19. Short Speech Key Generation Technology Based on Deep Learning

    With the increasing popularity of biometric identity authentication in important key authentication applications, biometric key generation technology...
    Zhengyin Lv, Zhendong Wu, Juan Chen in Machine Learning for Cyber Security
    Conference paper 2023
  20. Robust Automatic Speech Recognition Using Wavelet-Based Adaptive Wavelet Thresholding: A Review

    Automatic speech recognition (ASR) is one of the most fascinating fields of research and the performance of ASR systems is most promising in a closed...

    Mahadevaswamy Shanthamallappa, Kiran Puttegowda, ... Sudheesh Kannur Vasudeva Rao in SN Computer Science
    Article 01 February 2024