Cognitively Inspired Audiovisual Speech Filtering
Towards an Intelligent, Fuzzy Based, Multimodal, Two-Stage Speech Enhancement System
Chapter and Conference Paper
Eye blinking has been studied extensively due to its wide range of potential applications. However, one under-researched field is the use of the wider lacrimal area for detection. This paper proposes a new eye...
Article
Evaluation trials are crucial for measuring the performance of speaker verification systems. However, the design of trials that can faithfully reflect system performance and accurately distinguish between different s...
Chapter and Conference Paper
Over the past three decades, there has been sustained research activity in emotion recognition from faces, powered by the popularity of smart devices and the development of improved machine learning, resulting...
Chapter and Conference Paper
Human speech processing is a multimodal and cognitive activity, with visual information playing a role. Many lipreading systems use English speech data; however, Chinese is the most spoken language in the worl...
Chapter and Conference Paper
Modern life is ever more reliant on computers being able to classify the world around them, and computer vision is one of the ways they do so. Nowadays, due to the advent of reliable and low-cost range sen...
Chapter and Conference Paper
This paper presents a unified model that performs language and speaker recognition simultaneously. This model is based on a multi-task recurrent neural network, where the output of one task is fed in...
Chapter and Conference Paper
Games are a way both to enjoy leisure time and to learn. Understanding how mental processes associated with gaming work at a deeper level is very important, especially with emerging technologies such as consum...
Article
In recent years there has been significant interest in reversible data hiding and, in particular, reversible data hiding in encrypted images (RDH-EI). This means that additional data can be embedded into ...
Article
To populate knowledge repositories, such as WordNet, Freebase and NELL, two branches of research have grown separately for decades. On the one hand, corpus-based methods which leverage unstructured free texts ...
Article
Humans and animals can segment visual scenes thanks to a natural cognitive ability to quickly identify salient objects in both static and dynamic scenes. In this paper, we present a new spatio-tempor...
Chapter and Conference Paper
The concept of using visual information as part of audio speech processing has been of significant recent interest. This paper presents a data driven approach that considers estimating audio speech acoustics u...
Book
Towards an Intelligent, Fuzzy Based, Multimodal, Two-Stage Speech Enhancement System
Chapter
This chapter presents a summary of the general research domain, and also the relationship between audio and visual aspects of speech. The background to human speech production is briefly discussed, along with ...
Chapter
The speech enhancement research presented here was motivated by several factors. Firstly, the development in recent years of audio-only hearing aids that utilise sophisticated decision rules to determine the a...
Chapter
Previous research developments in the field of speech enhancement (such as multi-microphone arrays and speech enhancement algorithms) have been implemented in commercial hearing aids for the benefit of the d...
Chapter
The overall aim of this work is to utilise the relationship between audio and visual aspects of speech in order to develop a speech enhancement system. This chapter provides a detailed description of the initi...
Chapter
After an investigation of state-of-the-art research in Chap. 3, Chap. 4 proposed a new two-stage audiovisual speech enhancemen...
Chapter
This chapter presents a literature review that places the research proposed in this book in context, building on the background presented in the previous chapters. Firstly, the overall speech processing domain...
Chapter
As discussed in Chap. 2, the multimodal nature of both human speech production and perception is well established. The work presented in this book has utilised this mult...