1,803 Result(s)
-
Article
Progressive spatial–temporal transfer model for unsupervised person re-identification
Over the past decade, a more widespread area of computer vision research has been person re-identification (P-Reid). This technology is applied in fields such as pedestrian tracking, security, and video survei...
-
Article
Parameter-efficient tuning of cross-modal retrieval for a specific database via trainable textual and visual prompts
A novel cross-modal image retrieval method realized by parameter efficiently tuning a pre-trained cross-modal model is proposed in this study. Conventional cross-modal retrieval methods realize text-to-image r...
-
Article
PSNet: position-shift alignment network for image caption
Recently, Transformer-based models have gained increasing popularity in the field of image captioning. The global attention mechanism of the Transformer facilitates the integration of region and grid features,...
-
Article
A lightweight small object detection algorithm based on improved YOLOv5 for driving scenarios
Small object detection has been a longstanding challenge in the field of object detection, and achieving high detection accuracy is crucial for autonomous driving, especially for small objects. This article fo...
-
Article
CoCoOpter: Pre-train, prompt, and fine-tune the vision-language model for few-shot image classification
Few-shot image classification aims at learning to generalize to unseen new categories from a few training samples. Transfer learning is one prominent approach to the task, which first learns a backbone from th...
-
Article
Style-aware adversarial pairwise ranking for image recommendation systems
The vulnerability of Machine Learning (ML) models to adversarial attack and their prominence pose security issues, notably in image recommendation systems. The adversarial training method is an excellent strat...
-
Chapter and Conference Paper
CMC_v2: Towards More Accurate COVID-19 Detection with Discriminative Video Priors
This paper presents our solution for the 2nd COVID-19 Competition, occurring in the framework of the AIMIA Workshop at the European Conference on Computer Vision (ECCV 2022). In our approach, we employ the win...
-
Chapter and Conference Paper
Boosting COVID-19 Severity Detection with Infection-Aware Contrastive Mixup Classification
This paper presents our solution for the 2nd COVID-19 Severity Detection Competition. This task aims to distinguish the Mild, Moderate, Severe, and Critical grades in COVID-19 chest CT images. In our approach,...
-
Chapter and Conference Paper
BiTAT: Neural Network Binarization with Task-Dependent Aggregated Transformation
Neural network quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation, while preserving ...
-
Chapter and Conference Paper
BadDet: Backdoor Attacks on Object Detection
Backdoor attack is a severe security threat which injects a backdoor trigger into a small portion of training data such that the trained model gives incorrect predictions when the specific trigger appears. Whi...
-
Chapter and Conference Paper
Hydra Attention: Efficient Attention with Many Heads
While transformers have begun to dominate many tasks in vision, applying them to large images is still computationally difficult. A large reason for this is that self-attention scales quadratically with the nu...
-
Chapter and Conference Paper
An Improved Lightweight Network Based on YOLOv5s for Object Detection in Autonomous Driving
Object detection with high accuracy and fast inference speed based on camera sensors is important for autonomous driving. This paper develops a lightweight object detection network based on YOLOv5s which is on...
-
Chapter and Conference Paper
RPR-Net: A Point Cloud-Based Rotation-Aware Large Scale Place Recognition Network
Point cloud-based large scale place recognition is an important but challenging task for many applications such as Simultaneous Localization and Mapping (SLAM). Taking the task as a point cloud retrieval probl...
-
Article
Prototype local–global alignment network for image–text retrieval
Image–text retrieval is a challenging task due to the requirement of thorough multimodal understanding and precise inter-modality relationship discovery. However, most previous approaches resort to doing globa...
-
Article
FCT: fusing CNN and transformer for scene classification
Scene classification based on convolutional neural networks (CNNs) has achieved great success in recent years. In CNNs, the convolution operation performs well in extracting local features, but its ability to ...
-
Article
Your heart rate betrays you: multimodal learning with spatio-temporal fusion networks for micro-expression recognition
Micro-expressions can convey feelings that people are trying to hide. At present, some studies on micro-expression, most of which only use the temporal or spatial information in the image to recognize micro-ex...
-
Article
TCKGE: Transformers with contrastive learning for knowledge graph embedding
Representation learning of knowledge graphs has emerged as a powerful technique for various downstream tasks. In recent years, numerous research efforts have been made for knowledge graphs embedding. However, ...
-
Article
Multi-aware coreference relation network for visual dialog
As a challenging cross-media task, visual dialog assesses whether an AI agent can converse in human language based on its understanding of visual content. So the critical issue is to pay attention not only to ...
-
Chapter and Conference Paper
Chair Design of Waiting Space in Maternity Department Based on QFD-Kano and FBS
In order to design the seats in the waiting space of the obstetrics and gynecology department of the hospital, reduce the anxiety of pregnant women waiting for the inspection process, objectively and rationall...
-
Chapter and Conference Paper
TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning
This paper presents a transformer framework for few-shot learning, termed TransVLAD, with one focus showing the power of locally aggregated descriptors for few-shot learning. Our TransVLAD model is simple: a s...