Advances in Multimedia Information Processing – PCM 2012
13th Pacific-Rim Conference on Multimedia, Singapore, December 4-6, 2012. Proceedings
Chapter and Conference Paper
Image captioning is a multimodal task involving computer vision and natural language processing, where the goal is to learn a mapping from the image to its natural language description. In general, the mapping...
Chapter and Conference Paper
This paper presents a Quadtree Convolutional Neural Network (QCNN) for efficiently learning from image datasets representing sparse data such as handwriting, pen strokes, freehand sketches, etc. Instead of sto...
Chapter and Conference Paper
Current methods for single-image depth estimation use training datasets with real image-depth pairs or stereo pairs, which are not easy to acquire. We propose a framework, trained on synthetic image-depth pair...
Chapter and Conference Paper
Object detection is one of the major problems in computer vision, and has been extensively studied. Most of the existing detection works rely on labor-intensive supervision, such as ground truth bounding boxes...
Chapter and Conference Paper
Due to the fact that it is prohibitively expensive to completely annotate visual relationships, i.e., the (obj1, rel, obj2) triplets, relationship models are inevitably biased to object classes of limited pairwis...
Chapter and Conference Paper
Compared with depth-based 3D hand pose estimation, it is more challenging to infer 3D hand pose from monocular RGB images, due to substantial depth ambiguity and the difficulty of obtaining fully-annotated tra...
Chapter and Conference Paper
Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is of the s...
Article
Metamodeling is becoming a rather popular means to approximate the expensive simulations in today’s complex engineering design problems since accurate metamodels can bring in a lot of benefits. The metamodel a...
Chapter and Conference Paper
Facial action unit (AU) detection and face alignment are two highly correlated tasks since facial landmarks can provide precise AU locations to facilitate the extraction of meaningful local features for AU det...
Chapter and Conference Paper
In this paper, we consider Fisher vector in the context of domain adaptation, which has rarely been discussed by the existing domain adaptation methods. Particularly, in many real scenarios, the distributions ...
Chapter and Conference Paper
Video co-localization is the task of jointly localizing common objects across videos. Due to the appearance variations both across the videos and within the video, it is a challenging problem to identify and t...
Chapter and Conference Paper
Multi-label learning has recently attracted significant interest in computer vision, finding applications in many vision tasks such as multiple object recognition and automatic image annotation. Associating m...
Article
The existing high-quality environment matting methods usually require the capturing of a few thousand sample images and spend a few hours in data acquisition. In this paper, a novel environment matting algorit...
Chapter and Conference Paper
We propose a new type of saliency inspired by findings from visual search studies: the searching difficulty is correlated with the target-distractor contrast, the distractor homogeneity, as well as the tar...
Chapter and Conference Paper
The existing cosegmentation methods use intra-group information to extract a common object from a single image group. Observing that in many practical scenarios there often exist multiple image groups with dis...
Chapter and Conference Paper
Existing depth recovery methods for commodity RGB-D sensors primarily rely on low-level information for repairing the measured depth estimates. However, as the distance of the scene from the camera increases, ...
Chapter and Conference Paper
Most of the existing approaches for RGB-D indoor scene labeling employ hand-crafted features for each modality independently and combine them in a heuristic manner. There has been some attempt on directly lear...
Chapter and Conference Paper
Jointly segmenting common objects from multiple images remains a challenging problem. In this paper, we propose a multi-class cosegmentation method based on correlation clustering, which requires no prior know...
Article
Adrenocortical carcinoma (ACC) is a rare endocrine malignancy accounting for approximately 0.02–0.2% of all cancer deaths. The molecular pathogenesis of ACC has been the hot topic of recent reviews but it is s...