Search Results - Springer

Article

Open Access

Benefits of pre-trained mono- and cross-lingual speech representations for spoken language understanding of Dutch dysarthric speech

With the rise of deep learning, spoken language understanding (SLU) for command-and-control applications such as a voice-controlled virtual assistant can offer reliable hands-free operation to physically disab...

Pu Wang, Hugo Van hamme in EURASIP Journal on Audio, Speech, and Music Processing (2023)

Article

Open Access

Decoding of the speech envelope from EEG using the VLAAI deep neural network

To investigate the processing of speech in the brain, commonly simple linear models are used to establish a relationship between brain signals and speech features. However, these linear models are ill-equipped...

Bernd Accou, Jonas Vanthornhout, Hugo Van hamme, Tom Francart in Scientific Reports (2023)

Article

Open Access

Multi-encoder attention-based architectures for sound recognition with partial visual assistance

Large-scale sound recognition data sets typically consist of acoustic recordings obtained from multimedia libraries. As a consequence, modalities other than audio can often be exploited to improve the outputs ...

Wim Boes, Hugo Van hamme in EURASIP Journal on Audio, Speech, and Music Processing (2022)

Chapter and Conference Paper

An Equal Data Setting for Attention-Based Encoder-Decoder and HMM/DNN Models: A Case Study in Finnish ASR

Standard end-to-end training of attention-based ASR models only uses transcribed speech. If they are compared to HMM/DNN systems, which additionally leverage a large corpus of text-only data and expert-crafted...

Aku Rouhe, Astrid Van Camp, Mittul Singh, Hugo Van Hamme… in Speech and Computer (2021)
Article

Open Access

Show me where the action is!

Reality TV shows have gained popularity, motivating many production houses to bring new variants for us to watch. Compared to traditional TV shows, reality TV shows have spontaneous unscripted footage. Compute...

Timothy Callemein, Tom Roussel, Ali Diba… in Multimedia Tools and Applications (2021)

Chapter and Conference Paper

The CAMETRON Lecture Recording System: High Quality Video Recording and Editing with Minimal Human Supervision

In this paper, we demonstrate a system that automates the process of recording video lectures in classrooms. Through special hardware (lecturer and audience facing cameras and microphone arrays), we record mul...

Dries Hulens, Bram Aerts, Punarjay Chakravarty, Ali Diba… in MultiMedia Modeling (2018)
Chapter and Conference Paper

Automatic Smoker Detection from Telephone Speech Signals

This paper proposes an automatic smoking habit detection from spontaneous telephone speech signals. In this method, each utterance is modeled using i-vector and non-negative factor analysis (NFA) frameworks, w...

Amir Hossein Poorjam, Soheila Hesaraki, Saeid Safavi, Hugo van Hamme… in Speech and Computer (2017)
Article

Open Access

The self-taught vocal interface

Speech technology is firmly rooted in daily life, most notably in command-and-control (C&C) applications. C&C usability downgrades quickly, however, when used by people with non-standard speech. We pursue a fu...

Bart Ons, Jort F Gemmeke, Hugo Van hamme in EURASIP Journal on Audio, Speech, and Musi… (2014)

Chapter and Conference Paper

Label Noise Robustness and Learning Speed in a Self-Learning Vocal User Interface

A self-learning vocal user interface learns to map user-defined spoken commands to intended actions. The voice user interface is trained by mining the speech input and the provoked action on a device. Although...

Bart Ons, Jort F. Gemmeke, Hugo Van hamme in Natural Interaction with Robots, Knowbots … (2014)
Chapter

Missing Data Solutions for Robust Speech Recognition

Current automatic speech recognisers rely for a great deal on statistical models learned from training data. When they are deployed in conditions that differ from those observed in the training data, the gener...

Yujun Wang, Jort F. Gemmeke, Kris Demuynck… in Essential Speech and Language Technology f… (2013)

Chapter

The JASMIN Speech Corpus: Recordings of Children, Non-natives and Elderly People

Large speech corpora (LSC) constitute an indispensable resource for conducting research in speech processing and for developing real-life speech applications. In 2004 the Spoken Dutch Corpus (Corpus Gesproken ...

Catia Cucchiarini, Hugo Van hamme in Essential Speech and Language Technology for Dutch (2013)

Article

Open Access

Multi-candidate missing data imputation for robust speech recognition

The application of Missing Data Techniques (MDT) to increase the noise robustness of HMM/GMM-based large vocabulary speech recognizers is hampered by a large computational burden. The likelihood evaluations im...

Yujun Wang, Hugo Van hamme in EURASIP Journal on Audio, Speech, and Music Processing (2012)

Article

Human language technology and communicative disabilities: requirements and possibilities for the future

For some years now, the Nederlandse Taalunie (Dutch Language Union) has been active in promoting the development of human language technology (HLT) applications for speakers of Dutch with communicative disabil...

Marina B. Ruiter, Lilian J. Beijer, Catia Cucchiarini… in Language Resources and Evaluation (2012)
Chapter and Conference Paper

An On-Line NMF Model for Temporal Pattern Learning: Theory with Application to Automatic Speech Recognition

Convolutional non-negative matrix factorization (CNMF) can be used to discover recurring temporal (sequential) patterns in sequential vector non-negative data such as spectrograms or posteriorgrams. Drawbacks ...

Hugo Van Hamme in Latent Variable Analysis and Signal Separation (2012)
Article

Sparse conjugate directions pursuit with application to fixed-size kernel models

This work studies an optimization scheme for computing sparse approximate solutions of over-determined linear systems. Sparse Conjugate Directions Pursuit (SCDP) aims to construct a solution using only a small...

Peter Karsmakers, Kristiaan Pelckmans, Kris De Brabanter… in Machine Learning (2011)

Chapter and Conference Paper

Gaussian Selection Using Self-Organizing Map for Automatic Speech Recognition

The Self-Organizing Map (SOM) is widely applied for data clustering and visualization. In this paper, it is used to cluster Gaussians within the Hidden Markov Model (HMM) of the acoustic model for automatic sp...

Yujun Wang, Hugo Van hamme in Advances in Self-Organizing Maps (2011)
Chapter

Automatic Speech Recognition Using Missing Data Techniques: Handling of Real-World Data

In this chapter, we investigate the performance of a missing data recognizer on real-world speech from the SPEECON and SpeechDat-Car databases. In previous work we hypothesized that in real-world speech, which...

Jort F. Gemmeke, Maarten Van Segbroeck… in Robust Speech Recognition of Uncertain or … (2011)
Chapter and Conference Paper

On a Computational Model for Language Acquisition: Modeling Cross-Speaker Generalisation

The discovery of words by young infants involves two interrelated processes: (a) the detection of recurrent word-like acoustic patterns in the speech signal, and (b) cross-modal association between auditory an...

Louis ten Bosch, Joris Driesen, Hugo Van hamme, Lou Boves in Text, Speech and Dialogue (2009)
Article

Open Access

A Review of Signal Subspace Speech Enhancement and Its Application to Noise Robust Speech Recognition

The objective of this paper is threefold: (1) to provide an extensive review of signal subspace speech enhancement, (2) to derive an upper bound for the performance of these techniques, and (3) to present a co...

Kris Hermus, Patrick Wambacq, Hugo Van hamme in EURASIP Journal on Advances in Signal Proc… (2006)

19 Result(s)
