Search Results

Showing 1-20 of 535 results
  1. Fake speech detection using VGGish with attention block

    While deep learning technologies have made remarkable progress in generating deepfakes, their misuse has become a well-known concern. As a result,...

    Tahira Kanwal, Rabbia Mahum, ... Haseeb Hassan in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 26 June 2024
  2. GLFER-Net: a polyphonic sound source localization and detection network based on global-local feature extraction and recalibration

    Polyphonic sound source localization and detection (SSLD) task aims to recognize the categories of sound events, identify their onset and offset...

    Mengzhen Ma, Ying Hu, ... Hao Huang in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 26 June 2024
  3. Automatic dysarthria detection and severity level assessment using CWT-layered CNN model

    Dysarthria is a speech disorder that affects the ability to communicate due to articulation difficulties. This research proposes a novel method for...

    Shaik Sajiha, Kodali Radha, ... Durga Prasad Bavirisetti in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 25 June 2024
  4. Estimating the first and second derivatives of discrete audio data

    A new method for estimating the first and second derivatives of discrete audio signals intended to achieve higher computational precision in...

    Article Open access 18 June 2024
  5. MIRACLE—a microphone array impulse response dataset for acoustic learning

    This work introduces a large dataset comprising impulse responses of spatially distributed sources within a plane parallel to a planar microphone...

    Adam Kujawski, Art J. R. Pelling, Ennes Sarradj in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 18 June 2024
  6. Music time signature detection using ResNet18

    Time signature detection is a fundamental task in music information retrieval, aiding in music organization. In recent years, the demand for robust...

    Jeremiah Abimbola, Daniel Kostrzewa, Pawel Kasprowski in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 13 June 2024
  7. Exploration of Whisper fine-tuning strategies for low-resource ASR

    Limited data availability remains a significant challenge for Whisper’s low-resource speech recognition performance, falling short of practical...

    Yunpeng Liu, Xukui Yang, Dan Qu in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 01 June 2024
  8. Optimizing feature fusion for improved zero-shot adaptation in text-to-speech synthesis

    In the era of advanced text-to-speech (TTS) systems capable of generating high-fidelity, human-like speech by referring a reference speech, voice...

    Zhiyong Chen, Zhiqi Ai, ... Shugong Xu in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 28 May 2024
  9. Towards multidimensional attentive voice tracking—estimating voice state from auditory glimpses with regression neural networks and Monte Carlo sampling

    Selective attention is a crucial ability of the auditory system. Computationally, following an auditory object can be illustrated as tracking its...

    Joanna Luberadzka, Hendrik Kayser, ... Volker Hohmann in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 22 May 2024
  10. Sampling the user controls in neural modeling of audio devices

    This work studies neural modeling of nonlinear parametric audio circuits, focusing on how the diversity of settings of the target device user...

    Otto Mikkonen, Alec Wright, Vesa Välimäki in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 20 May 2024
  11. Continuous lipreading based on acoustic temporal alignments

    Visual speech recognition (VSR) is a challenging task that has received increasing interest during the last few decades. Current state of the art...

    David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 06 May 2024
  12. Mi-Go: tool which uses YouTube as data source for evaluating general-purpose speech recognition machine learning models

    This article introduces Mi-Go, a tool aimed at evaluating the performance and adaptability of general-purpose speech recognition machine learning...

    Tomasz Wojnar, Jarosław Hryszko, Adam Roman in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 01 May 2024
  13. Exploring the power of pure attention mechanisms in blind room parameter estimation

    Dynamic parameterization of acoustic environments has drawn widespread attention in the field of audio processing. Precise representation of local...

    Chunxi Wang, Maoshen Jia, ... Wenyu Jin in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 24 April 2024
  14. Robust acoustic reflector localization using a modified EM algorithm

    In robotics, echolocation has been used to detect acoustic reflectors, e.g., walls, as it aids the robotic platform to navigate in darkness and also...

    Usama Saqib, Mads Græsbøll Christensen, Jesper Rindom Jensen in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 18 April 2024
  15. Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement

    Speech signals are often distorted by reverberation and noise, with a widely distributed signal-to-noise ratio (SNR). To address this, our study...

    Zehua Zhang, Lu Zhang, ... Mingjiang Wang in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 11 April 2024
  16. Multi-rate modulation encoding via unsupervised learning for audio event detection

    Technologies in healthcare, smart homes, security, ecology, and entertainment all deploy audio event detection (AED) in order to detect sound events...

    Sandeep Reddy Kothinti, Mounya Elhilali in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 01 April 2024
  17. DeepDet: YAMNet with BottleNeck Attention Module (BAM) for TTS synthesis detection

    Spoofed speeches are becoming a big threat to society due to advancements in artificial intelligence techniques. Therefore, there must be an...

    Rabbia Mahum, Aun Irtaza, ... Haseeb Hassan in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 01 April 2024
  18. Synthesis of soundfields through irregular loudspeaker arrays based on convolutional neural networks

    Most soundfield synthesis approaches deal with extensive and regular loudspeaker arrays, which are often not suitable for home audio systems, due to...

    Luca Comanducci, Fabio Antonacci, Augusto Sarti in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 28 March 2024
  19. An end-to-end approach for blindly rendering a virtual sound source in an audio augmented reality environment

    Audio augmented reality (AAR), a prominent topic in the field of audio, requires understanding the listening environment of the user for rendering an...

    Shivam Saini, Isaac Engel, Jürgen Peissig in EURASIP Journal on Audio, Speech, and Music Processing
    Article Open access 27 March 2024