-
Article
Open Access
Dynamic Context Removal: A General Training Strategy for Robust Models on Video Action Predictive Tasks
Predicting future actions is an essential feature of intelligent systems and embodied AI. However, compared to the traditional recognition tasks, the uncertainty of the future and the reasoning ability require...
-
Article
InstaBoost++: Visual Coherence Principles for Unified 2D/3D Instance Level Data Augmentation
Instance-level perception tasks like object detection, instance segmentation, and 3D detection require many training samples to achieve satisfactory performance. The meticulous labels for these tasks are usual...
-
Article
Cortical ensembles orchestrate social competition through hypothalamic outputs
Most social species self-organize into dominance hierarchies [1,2], which decreases aggression and conserves energy [3,4], but it is not clear how individuals know their social rank. We have only begun to learn how the...
-
Chapter and Conference Paper
Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection
Human-Object Interaction (HOI) detection plays a crucial role in activity understanding. Though significant progress has been made, interactiveness learning remains a challenging problem in HOI detection: exis...
-
Chapter and Conference Paper
D&D: Learning Human Dynamics from Dynamic Camera
3D human pose estimation from a monocular video has recently seen significant improvements. However, most state-of-the-art methods are kinematics-based, which are prone to physically implausible motions with p...
-
Chapter and Conference Paper
Constructing Balance from Imbalance for Long-Tailed Image Recognition
Long-tailed image recognition presents massive challenges to deep learning systems since the imbalance between majority (head) classes and minority (tail) classes severely skews the data-driven deep neural net...
-
Chapter and Conference Paper
Unsupervised Visual Representation Learning by Synchronous Momentum Grouping
In this paper, we propose a genuine group-level contrastive visual representation learning method whose linear evaluation performance on ImageNet surpasses the vanilla supervised learning. Two mainstream unsup...
-
Article
Towards a new generation of artificial intelligence in China
Artificial intelligence has become a main driving force for a new round of industrial transformation around the world. Many countries including China are seizing the opportunity of the AI revolution to promote...
-
Article
Complex sequential understanding through the awareness of spatial and temporal concepts
Understanding sequential information is a fundamental task for artificial intelligence. Current neural networks attempt to learn spatial and temporal information as a whole, limiting their abilities to represe...
-
Chapter and Conference Paper
Asynchronous Interaction Aggregation for Action Detection
Understanding interaction is an essential part of video action detection. We propose the Asynchronous Interaction Aggregation network (AIA) that leverages different interactions to boost action detection. Ther...
-
Chapter and Conference Paper
HMOR: Hierarchical Multi-person Ordinal Relations for Monocular Multi-person 3D Pose Estimation
Remarkable progress has been made in 3D human pose estimation from a monocular RGB camera. However, only a few studies explored 3D multi-person cases. In this paper, we attempt to address the lack of a global ...
-
Chapter and Conference Paper
Human Correspondence Consensus for 3D Object Semantic Understanding
Semantic understanding of 3D objects is crucial in many applications such as object manipulation. However, it is hard to give a universal definition of point-level semantics that everyone would agree on. We ob...
-
Article
Fast Abnormal Event Detection
Fast abnormal event detection meets the growing demand to process an enormous number of surveillance videos. Based on the inherent redundancy of video structures, we propose an efficient sparse combination lea...
-
Chapter and Conference Paper
Attention-Based Audio-Visual Fusion for Video Summarization
Video summarization compresses videos while preserving the most meaningful content for users. Many image-based works focus on how to effectively utilize video visual cues to choose keyframes. However, apart fr...
-
Chapter and Conference Paper
Pairwise Body-Part Attention for Recognizing Human-Object Interactions
In human-object interactions (HOI) recognition, conventional methods consider the human body as a whole and pay a uniform attention to the entire body region. They ignore the fact that normally, human interact...
-
Chapter and Conference Paper
SRDA: Generating Instance Segmentation Annotation via Scanning, Reasoning and Domain Adaptation
Instance segmentation is a problem of significance in computer vision. However, preparing annotated data for this task is extremely time-consuming and costly. By combining the advantages of 3D scanning, reason...
-
Chapter and Conference Paper
Visual Relationship Detection with Language Priors
Visual relationships capture a wide variety of interactions between pairs of objects in images (e.g. “man riding bicycle” and “man pushing bicycle”). Consequently, the set of possible relationships is extremel...
-
Article
Contrast Preserving Decolorization with Perception-Based Quality Metrics
Converting color images into grayscale ones suffers from information loss. In the meantime, it is one fundamental tool indispensable for single channel image processing, digital printing, and monotone e-ink dis...
-
Article
Clutter suppression method in GPR using particle clustering
In this paper, a novel clutter suppression method in Ground Penetrating Radar (GPR) is proposed. Time segments of hill are represented by their corresponding particle in B-scan. Those particles in B-scan are c...
-
Article
Statistical approximation based fine frequency synchronization for OFDM systems
The paper proposes a novel approach for fine frequency synchronization of OFDM synchronization systems in multi-path channels. Maximum Likelihood (ML) function of frequency offsets including integral and decim...