-
Chapter and Conference Paper
IVIST: Interactive Video Search Tool in VBS 2022
This paper presents the details of the proposed video retrieval tool, named Interactive VIdeo Search Tool (IVIST) for the Video Browser Showdown (VBS) 2022. In order to retrieve desired videos from a multimedi...
-
Chapter and Conference Paper
Speaker-Adaptive Lip Reading with User-Dependent Padding
Lip reading aims to predict speech based on lip movements alone. As it focuses on visual information to model the speech, its performance is inherently sensitive to personal lip appearances and movements. This...
-
Chapter and Conference Paper
Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment
Retrieving desired videos using natural language queries has attracted increasing attention in research and industry fields as a huge number of videos appear on the internet. Some existing methods attempted to...
-
Chapter and Conference Paper
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection
The goal of this work is to reconstruct speech from a silent talking face video. Recent studies have shown impressive performance on synthesizing speech from silent talking face videos. However, they have not ...
-
Chapter and Conference Paper
Robust Multispectral Pedestrian Detection via Uncertainty-Aware Cross-Modal Learning
With the development of deep neural networks, multispectral pedestrian detection has received great attention by exploiting complementary properties of multiple modalities (e.g., color-visible and thermal ...
-
Chapter and Conference Paper
IVIST: Interactive Video Search Tool in VBS 2021
This paper presents a new version of the Interactive VIdeo Search Tool (IVIST), a video retrieval tool, for participation in the Video Browser Showdown (VBS) 2021. In the previous IVIST (VBS 2020), there w...
-
Chapter and Conference Paper
Correction to: MultiMedia Modeling
The original version of this book was revised. Due to a technical error, the first volume editor did not appear in the volumes of the MMM 2020 proceedings. This was corrected and the first volume editor was ad...
-
Chapter and Conference Paper
IVIST: Interactive VIdeo Search Tool in VBS 2020
This paper presents a new video retrieval tool, Interactive VIdeo Search Tool (IVIST), which participates in the 2020 Video Browser Showdown (VBS). As a video retrieval tool, IVIST is equipped with proper and...
-
Chapter and Conference Paper
Face Tells Detailed Expression: Generating Comprehensive Facial Expression Sentence Through Facial Action Units
Human facial expression plays a key role in understanding social behavior. Many deep learning approaches present facial emotion recognition and automatic image captioning considering human sentime...
-
Chapter and Conference Paper
Deep Learning-Based Video Retrieval Using Object Relationships and Associated Audio Classes
This paper introduces a video retrieval tool for the 2020 Video Browser Showdown (VBS). The tool enhances the user’s video browsing experience by ensuring full use of video analysis database constructed prior ...
-
Chapter and Conference Paper
Correction to: MultiMedia Modeling
The original version of this book was revised. Due to a technical error, the first volume editor did not appear in the volumes of the MMM 2020 proceedings. A funding number was missing in the acknowledgement s...
-
Chapter and Conference Paper
SACA Net: Cybersickness Assessment of Individual Viewers for VR Content via Graph-Based Symptom Relation Embedding
Recently, cybersickness assessment for VR content is required to deal with viewing safety issues. Assessing physical symptoms of individual viewers is challenging but important to provide detailed and personal...
-
Chapter and Conference Paper
Feature2Mass: Visual Feature Processing in Latent Space for Realistic Labeled Mass Generation
This paper deals with a method for generating realistic labeled masses. Recently, there have been many attempts to apply deep learning to various bio-image computing fields including computer-aided detection a...
-
Chapter and Conference Paper
Photo-Realistic Facial Emotion Synthesis Using Multi-level Critic Networks with Multi-level Generative Model
In this paper, we propose photo-realistic facial emotion synthesis by using a novel multi-level critic network with multi-level generative model. We devise a new facial emotion generator containing the propose...
-
Chapter and Conference Paper
Generation of Multimodal Justification Using Visual Word Constraint Model for Explainable Computer-Aided Diagnosis
The ambiguity of the decision-making process has been pointed out as the main obstacle to practically applying the deep learning-based method in spite of its outstanding performance. Interpretability can guara...
-
Chapter and Conference Paper
Realistic Breast Mass Generation Through BIRADS Category
Generating realistic breast masses is a highly important task because the large-size database of annotated breast masses is scarcely available. In this study, a novel realistic breast mass generation framework...
-
Chapter and Conference Paper
Teacher and Student Joint Learning for Compact Facial Landmark Detection Network
Compact neural networks with limited memory and computation are demanding in recently popularized mobile applications. The reduction of network parameters is an important priority. In this paper, we address a ...
-
Chapter and Conference Paper
Convolution with Logarithmic Filter Groups for Efficient Shallow CNN
In convolutional neural networks (CNNs), the filter grouping in convolution layers is known to be useful to reduce the network parameter size. In this paper, we propose a new logarithmic filter grouping which ...
-
Chapter and Conference Paper
Facial Dynamics Interpreter Network: What Are the Important Relations Between Local Dynamics for Facial Trait Estimation?
Human face analysis is an important task in computer vision. According to cognitive-psychological studies, facial dynamics could provide crucial cues for face analysis. The motion of a facial local region in f...
-
Chapter and Conference Paper
Learning Features Robust to Image Variations with Siamese Networks for Facial Expression Recognition
This paper proposes a computationally efficient method for learning features robust to image variations for facial expression recognition (FER). The proposed method minimizes the feature difference between an ...