  1. Article

    Open Access

    Benefits of pre-trained mono- and cross-lingual speech representations for spoken language understanding of Dutch dysarthric speech

    With the rise of deep learning, spoken language understanding (SLU) for command-and-control applications such as a voice-controlled virtual assistant can offer reliable hands-free operation to physically disab...

    Pu Wang, Hugo Van hamme in EURASIP Journal on Audio, Speech, and Music Processing (2023)

  2. Article

    Open Access

    Decoding of the speech envelope from EEG using the VLAAI deep neural network

    To investigate the processing of speech in the brain, commonly simple linear models are used to establish a relationship between brain signals and speech features. However, these linear models are ill-equipped...

    Bernd Accou, Jonas Vanthornhout, Hugo Van hamme, Tom Francart in Scientific Reports (2023)

  3. Article

    Open Access

    Multi-encoder attention-based architectures for sound recognition with partial visual assistance

    Large-scale sound recognition data sets typically consist of acoustic recordings obtained from multimedia libraries. As a consequence, modalities other than audio can often be exploited to improve the outputs ...

    Wim Boes, Hugo Van hamme in EURASIP Journal on Audio, Speech, and Music Processing (2022)

  4. No Access

    Chapter and Conference Paper

    An Equal Data Setting for Attention-Based Encoder-Decoder and HMM/DNN Models: A Case Study in Finnish ASR

    Standard end-to-end training of attention-based ASR models only uses transcribed speech. If they are compared to HMM/DNN systems, which additionally leverage a large corpus of text-only data and expert-crafted...

    Aku Rouhe, Astrid Van Camp, Mittul Singh, Hugo Van Hamme in Speech and Computer (2021)

  5. Article

    Open Access

    Show me where the action is!

    Reality TV shows have gained popularity, motivating many production houses to bring new variants for us to watch. Compared to traditional TV shows, reality TV shows have spontaneous unscripted footage. Compute...

    Timothy Callemein, Tom Roussel, Ali Diba in Multimedia Tools and Applications (2021)

  6. No Access

    Chapter and Conference Paper

    The CAMETRON Lecture Recording System: High Quality Video Recording and Editing with Minimal Human Supervision

    In this paper, we demonstrate a system that automates the process of recording video lectures in classrooms. Through special hardware (lecturer and audience facing cameras and microphone arrays), we record mul...

    Dries Hulens, Bram Aerts, Punarjay Chakravarty, Ali Diba in MultiMedia Modeling (2018)

  7. No Access

    Chapter and Conference Paper

    Automatic Smoker Detection from Telephone Speech Signals

    This paper proposes an automatic smoking habit detection from spontaneous telephone speech signals. In this method, each utterance is modeled using i-vector and non-negative factor analysis (NFA) frameworks, w...

    Amir Hossein Poorjam, Soheila Hesaraki, Saeid Safavi, Hugo van Hamme in Speech and Computer (2017)

  8. Article

    Open Access

    The self-taught vocal interface

    Speech technology is firmly rooted in daily life, most notably in command-and-control (C&C) applications. C&C usability downgrades quickly, however, when used by people with non-standard speech. We pursue a fu...

    Bart Ons, Jort F Gemmeke, Hugo Van hamme in EURASIP Journal on Audio, Speech, and Musi… (2014)

  9. No Access

    Chapter and Conference Paper

    Label Noise Robustness and Learning Speed in a Self-Learning Vocal User Interface

    A self-learning vocal user interface learns to map user-defined spoken commands to intended actions. The voice user interface is trained by mining the speech input and the provoked action on a device. Although...

    Bart Ons, Jort F. Gemmeke, Hugo Van hamme in Natural Interaction with Robots, Knowbots … (2014)

  10. Chapter

    Missing Data Solutions for Robust Speech Recognition

    Current automatic speech recognisers rely for a great deal on statistical models learned from training data. When they are deployed in conditions that differ from those observed in the training data, the gener...

    Yujun Wang, Jort F. Gemmeke, Kris Demuynck in Essential Speech and Language Technology f… (2013)

  11. Chapter

    The JASMIN Speech Corpus: Recordings of Children, Non-natives and Elderly People

    Large speech corpora (LSC) constitute an indispensable resource for conducting research in speech processing and for developing real-life speech applications. In 2004 the Spoken Dutch Corpus (Corpus Gesproken ...

    Catia Cucchiarini, Hugo Van hamme in Essential Speech and Language Technology for Dutch (2013)

  12. Article

    Open Access

    Multi-candidate missing data imputation for robust speech recognition

    The application of Missing Data Techniques (MDT) to increase the noise robustness of HMM/GMM-based large vocabulary speech recognizers is hampered by a large computational burden. The likelihood evaluations im...

    Yujun Wang, Hugo Van hamme in EURASIP Journal on Audio, Speech, and Music Processing (2012)

  13. No Access

    Article

    Human language technology and communicative disabilities: requirements and possibilities for the future

    For some years now, the Nederlandse Taalunie (Dutch Language Union) has been active in promoting the development of human language technology (HLT) applications for speakers of Dutch with communicative disabil...

    Marina B. Ruiter, Lilian J. Beijer, Catia Cucchiarini in Language Resources and Evaluation (2012)

  14. No Access

    Chapter and Conference Paper

    An On-Line NMF Model for Temporal Pattern Learning: Theory with Application to Automatic Speech Recognition

    Convolutional non-negative matrix factorization (CNMF) can be used to discover recurring temporal (sequential) patterns in sequential vector non-negative data such as spectrograms or posteriorgrams. Drawbacks ...

    Hugo Van Hamme in Latent Variable Analysis and Signal Separation (2012)

  15. Article

    Sparse conjugate directions pursuit with application to fixed-size kernel models

    This work studies an optimization scheme for computing sparse approximate solutions of over-determined linear systems. Sparse Conjugate Directions Pursuit (SCDP) aims to construct a solution using only a small...

    Peter Karsmakers, Kristiaan Pelckmans, Kris De Brabanter in Machine Learning (2011)

  16. No Access

    Chapter and Conference Paper

    Gaussian Selection Using Self-Organizing Map for Automatic Speech Recognition

    The Self-Organizing Map (SOM) is widely applied for data clustering and visualization. In this paper, it is used to cluster Gaussians within the Hidden Markov Model (HMM) of the acoustic model for automatic sp...

    Yujun Wang, Hugo Van hamme in Advances in Self-Organizing Maps (2011)

  17. No Access

    Chapter

    Automatic Speech Recognition Using Missing Data Techniques: Handling of Real-World Data

    In this chapter, we investigate the performance of a missing data recognizer on real-world speech from the SPEECON and SpeechDat-Car databases. In previous work we hypothesized that in real-world speech, which...

    Jort F. Gemmeke, Maarten Van Segbroeck in Robust Speech Recognition of Uncertain or … (2011)

  18. No Access

    Chapter and Conference Paper

    On a Computational Model for Language Acquisition: Modeling Cross-Speaker Generalisation

    The discovery of words by young infants involves two interrelated processes: (a) the detection of recurrent word-like acoustic patterns in the speech signal, and (b) cross-modal association between auditory an...

    Louis ten Bosch, Joris Driesen, Hugo Van hamme, Lou Boves in Text, Speech and Dialogue (2009)

  19. Article

    Open Access

    A Review of Signal Subspace Speech Enhancement and Its Application to Noise Robust Speech Recognition

    The objective of this paper is threefold: (1) to provide an extensive review of signal subspace speech enhancement, (2) to derive an upper bound for the performance of these techniques, and (3) to present a co...

    Kris Hermus, Patrick Wambacq, Hugo Van hamme in EURASIP Journal on Advances in Signal Proc… (2006)