-
Chapter and Conference Paper
Unpaired Image Captioning by Language Pivoting
Image captioning is a multimodal task involving computer vision and natural language processing, where the goal is to learn a mapping from the image to its natural language description. In general, the mapping...
-
Chapter and Conference Paper
Quadtree Convolutional Neural Networks
This paper presents a Quadtree Convolutional Neural Network (QCNN) for efficiently learning from image datasets representing sparse data such as handwriting, pen strokes, freehand sketches, etc. Instead of sto...
-
Chapter and Conference Paper
T\(^2\)Net: Synthetic-to-Realistic Translation for Solving Single-Image Depth Estimation Tasks
Current methods for single-image depth estimation use training datasets with real image-depth pairs or stereo pairs, which are not easy to acquire. We propose a framework, trained on synthetic image-depth pair...
-
Chapter and Conference Paper
Zero-Annotation Object Detection with Web Knowledge Transfer
Object detection is one of the major problems in computer vision, and has been extensively studied. Most of the existing detection works rely on labor-intensive supervision, such as ground truth bounding boxes...
-
Chapter and Conference Paper
Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features
Due to the fact that it is prohibitively expensive to completely annotate visual relationships, i.e., the (obj1, rel, obj2) triplets, relationship models are inevitably biased to object classes of limited pairwis...
-
Chapter and Conference Paper
Weakly-Supervised 3D Hand Pose Estimation from Monocular RGB Images
Compared with depth-based 3D hand pose estimation, it is more challenging to infer 3D hand pose from monocular RGB images, due to substantial depth ambiguity and the difficulty of obtaining fully-annotated tra...
-
Chapter and Conference Paper
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions
Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is of the s...
-
Chapter and Conference Paper
Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment
Facial action unit (AU) detection and face alignment are two highly correlated tasks since facial landmarks can provide precise AU locations to facilitate the extraction of meaningful local features for AU det...
-
Chapter and Conference Paper
Domain Adaptive Fisher Vector for Visual Recognition
In this paper, we consider Fisher vector in the context of domain adaptation, which has rarely been discussed by the existing domain adaptation methods. Particularly, in many real scenarios, the distributions ...
-
Chapter and Conference Paper
CATS: Co-saliency Activated Tracklet Selection for Video Co-localization
Video co-localization is the task of jointly localizing common objects across videos. Due to the appearance variations both across the videos and within the video, it is a challenging problem to identify and t...
-
Chapter and Conference Paper
Improving Multi-label Learning with Missing Labels by Structured Semantic Correlations
Multi-label learning has attracted significant interest in computer vision recently, finding applications in many vision tasks such as multiple object recognition and automatic image annotation. Associating m...
-
Chapter and Conference Paper
Multi-modal Unsupervised Feature Learning for RGB-D Scene Labeling
Most of the existing approaches for RGB-D indoor scene labeling employ hand-crafted features for each modality independently and combine them in a heuristic manner. There have been some attempts at directly lear...
-
Article
Open Access
Channel Resource Allocation for VoIP Applications in Collaborative IEEE 802.11/802.16 Networks
Collaboration between IEEE 802.11 and IEEE 802.16 networks operating in a common spectrum allows bandwidth resources to be allocated dynamically to achieve improved performance for network applications. This pap...
-
Article
Open Access
Introduction to the Special Issue on Wireless Video
-
Chapter and Conference Paper
Medium Access Cooperations for Improving VoIP Capacity over Hybrid 802.16/802.11 Cognitive Radio Networks
There are some existing works that study the coexistence of 802.16 and 802.11 networks. However, not many of them consider the resource allocation issues in the case of delivering traffic between mobile statio...