Search Results
-
Multi-scale hash encoding based neural geometry representation
Recently, neural implicit function-based representation has attracted more and more attention, and has been widely used to represent surfaces using...
-
Learning multi-level and multi-scale deep representations for privacy image classification
Privacy image classification can help people detect privacy images when people share images. In this paper, we propose a novel method using...
-
Relational multi-scale metric learning for few-shot knowledge graph completion
Few-shot knowledge graph completion (FKGC) refers to the task of inferring missing facts in a knowledge graph by utilizing a limited number of...
-
Retinal artery/vein classification by multi-channel multi-scale fusion network
The automatic artery/vein (A/V) classification in retinal fundus images plays a significant role in detecting vascular abnormalities and could speed...
-
MLANet: multi-level attention network with multi-scale feature fusion for crowd counting
Estimating the population in a given scene is a process known as crowd counting. The field has recently garnered significant attention, and many...
-
TAE: Topic-aware encoder for large-scale multi-label text classification
Convolutional neural networks, recurrent neural networks, and transformers have excelled in representation learning for large-scale multi-label text...
-
Person re-identification based on multi-scale feature fusion and multi-attention mechanism
Person re-identification is an image retrieval technique for persons in real scenes. Due to factors such as camera angle, lighting, and occlusion,...
-
Multi-view Self-supervised Learning and Multi-scale Feature Fusion for Automatic Speech Recognition
To address the challenges of the poor representation capability and low data utilization rate of end-to-end speech recognition models in deep...
-
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey
With the urgent demand for generalized deep models, many pre-trained big models are proposed, such as bidirectional encoder representations (BERT),...
-
Accurate Facial Landmark Detector via Multi-scale Transformer
Facial landmark detection is an essential prerequisite for many face applications, which has attracted much attention and made remarkable progress in...
-
PointCMC: cross-modal multi-scale correspondences learning for point cloud understanding
Existing cross-modal frameworks have achieved impressive performance in point cloud object representations learning, where a 2D image encoder is...
-
Lightweight multi-scale network with attention for accurate and efficient crowd counting
Crowd counting is a significant task in computer vision, which aims to estimate the total number of people appearing in images or videos. However, it...
-
MPA-GNet: multi-scale parallel adaptive graph network for 3D human pose estimation
Graph convolutional networks (GCNs) have achieved remarkable performance in the 2D-to-3D human pose estimation (HPE) task. The adjacency matrix in...
-
3D Human pose estimation from video via multi-scale multi-level spatial temporal features
In this paper, we present an innovative framework for 2D-to-3D human pose estimation from video, harnessing the power of multi-scale multi-level...
-
MSGNN: Multi-scale Spatio-temporal Graph Neural Network for epidemic forecasting
Infectious disease forecasting has been a key focus and has proved crucial in controlling epidemics. A recent trend is to develop forecasting models...
-
MMFL-net: multi-scale and multi-granularity feature learning for cross-domain fashion retrieval
Instance-level image retrieval in fashion industry is a challenging issue owing to its increasing importance in real-scenario visual fashion search....
-
Crowd Counting based on Multi-level Multi-scale Feature
Crowd counting has drawn increasing attention for its significance in real-world applications. However, it’s still a challenging task because of scale...
-
2MGAS-Net: multi-level multi-scale gated attentional squeezed network for polyp segmentation
Accurate segmentation of colon polyps in endoscopic images is crucial for early colorectal cancer diagnosis and treatment planning. However,...
-
GaitASMS: gait recognition by adaptive structured spatial representation and multi-scale temporal aggregation
Gait recognition is one of the most promising video-based biometric technologies. The edge of silhouettes and motion are the most informative feature...
-
MS-RAFT+: High Resolution Multi-Scale RAFT
Hierarchical concepts have proven useful in many classical and learning-based optical flow methods regarding both accuracy and robustness. In this...