-
Chapter and Conference Paper
An Effective Visible-Infrared Person Re-identification Network Based on Second-Order Attention and Mixed Intermediate Modality
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem. Due to the significant cross-modality discrepancy, it is difficult to learn discriminative feat...
-
Chapter and Conference Paper
UAM-Net: An Attention-Based Multi-level Feature Fusion UNet for Remote Sensing Image Segmentation
Semantic segmentation of Remote Sensing Images (RSIs) is an essential application for precision agriculture, environmental protection, and economic assessment. While UNet-based networks have made significant p...
-
Chapter and Conference Paper
MCKIE: Multi-class Key Information Extraction from Complex Documents Based on Graph Convolutional Network
The majority of key information extraction in document analysis work relies on simple layout scenes with few classes, such as the date and amount on invoices or receipts. However, many document applications en...
-
Chapter and Conference Paper
GridIIS: Grid Based Interactive Image Segmentation
Interactive segmentation enables users to specify the object of interest (OOI) via various interaction strategies to obtain accurate segmentation results. An ideal interactive method should efficiently and acc...
-
Chapter and Conference Paper
Pseudo Labels Refinement with Stable Cluster Reconstruction for Unsupervised Re-identification
Most existing unsupervised re-identification methods use a clustering-based approach to generate pseudo-labels as supervisory signals, allowing deep neural networks to learn discriminative representations without anno...
-
Chapter and Conference Paper
DeCAB: Debiased Semi-supervised Learning for Imbalanced Open-Set Data
Semi-supervised learning (SSL) has received significant attention due to its ability to use limited labeled data and various unlabeled data to train models with high generalization performance. However, the as...
-
Chapter and Conference Paper
Frequency and Spatial Domain Filter Network for Visual Object Tracking
Cross-correlation serves as the core similarity calculation operation in Siamese-based trackers, and generally produces response maps with high values at the target center. During this process, global context,...
-
Chapter and Conference Paper
LLM Collaboration PLM Improves Critical Information Extraction Tasks in Medical Articles
With the development of modern medical informatics and databases, medical professionals are increasingly inclined to use evidence-based medicine to guide their learning and work. Evidence-based medicine requir...
-
Article
CA-CentripetalNet: a novel anchor-free deep learning framework for hardhat wearing detection
To deal with the poor generalization of previous deep learning-based methods, a novel anchor-free deep learning framework called CA-CentripetalNet is proposed for hardhat wearing detection. Two novel schemes a...
-
Article
SPL-Net: Spatial-Semantic Patch Learning Network for Facial Attribute Recognition with Limited Labeled Data
Existing deep learning-based facial attribute recognition (FAR) methods rely heavily on large-scale labeled training data. Unfortunately, in many real-world applications, only limited labeled data are availabl...
-
Article
Active Perception for Visual-Language Navigation
Visual-language navigation (VLN) is the task of having an agent carry out navigational instructions inside photo-realistic environments. One of the key challenges in VLN is how to conduct robust navigati...
-
Article
A one-stage deep learning framework for automatic detection of safety harnesses in high-altitude operations
Safety harnesses play an essential role in protecting workers in high-altitude operations from falls. Automatic detection of safety harness wearing is important for safety management. To deal...
-
Chapter and Conference Paper
A Bibliographic Study of Macular Fovea Detection: AI-Based Methods, Applications, and Issues
This study applied bibliographic analysis and text mining to the literature in the Web of Science (WOS) and Scopus databases. About 79 and 632 related articles were collected from WOS and Scopus, ...
-
Chapter and Conference Paper
Real-Time Train Rescheduling with Passenger Demand for Rolling Stock Rescue
With the expansion of urban rail transit systems, equipment failures, especially train failures, are becoming more frequent and can seriously disrupt train operations. Hence, the effecti...
-
Chapter and Conference Paper
Ocular Tactile Vibration Intervention in VR and Its Modeling Coupled with Visual Fusion
The main application of virtual reality (VR) is to immerse users in a three-dimensional simulated environment so that they can experience the virtual world. At present, VR products and content on the market have...
-
Article
Context-Preserving Region-Based Contrastive Learning Framework for Ship Detection in SAR
Ship detection in Synthetic Aperture Radar (SAR) is a challenging task due to the random orientation of ships and the discrete appearance caused by the radar signal. In this paper, we introduce a novel unsupervised...
-
Article
Automatic detection of ultrasound breast lesions: a novel saliency detection model based on multiple priors
Due to the complex tissue structure of the breast, breast ultrasound (BUS) images exhibit low contrast and blurred lesion boundaries. Therefore, accurate automatic detection of ultrasound ...
-
Article
Adaptive Deep Disturbance-Disentangled Learning for Facial Expression Recognition
In this paper, we propose a novel adaptive deep disturbance-disentangled learning (ADDL) method for effective facial expression recognition (FER). ADDL involves a two-stage learning procedure. First, a disturb...
-
Chapter and Conference Paper
Study on the Evaluation Method of the Clarity of Critical Areas of Digital Images
In response to the current need of machine vision systems for digital image clarity evaluation methods, this paper proposes an objective method for evaluating the clarity of critical image areas, focusing o...
-
Chapter and Conference Paper
Semantic-Aware Non-local Network for Handwritten Mathematical Expression Recognition
Handwritten mathematical expression recognition (HMER) is a challenging task due to the complex two-dimensional structure of mathematical expressions and the high similarity between handwritten texts. Most exi...