248 Result(s)
-
Article
A new virtual interpolation technology with range as object
Virtual interpolation technology can be applied to direction-of-arrival (DOA) estimation as a preprocessing technique to achieve the DOA estimation for any array. In order to solve the angle-sensitive problem ...
-
Article
Hugs Bring Double Benefits: Unsupervised Cross-Modal Hashing with Multi-granularity Aligned Transformers
Unsupervised cross-modal hashing (UCMH) has been commonly explored to support large-scale cross-modal retrieval of unlabeled data. Despite promising progress, most existing approaches are developed on convolut...
-
Article
Softmax-Free Linear Transformers
Vision transformers (ViTs) have pushed the state-of-the-art for visual perception tasks. The self-attention mechanism underpinning the strength of ViTs has a quadratic complexity in both computation and memory...
-
Article
Does Confusion Really Hurt Novel Class Discovery?
When sampling data of specific classes (i.e., known classes) for a scientific task, collectors may encounter unknown classes (i.e., novel classes). Since these novel classes might be valuable for future research,...
-
Article
DML-YOLOv8-SAR image object detection algorithm
Given the challenges posed by noise and varying target scales in SAR images, conventional convolutional neural networks often underperform in SAR image detection. To address this, this paper introduces a novel...
-
Article
Using improved YOLO V5s to recognize tomatoes in a continuous working environment
In the continuous working environment of the picking robots, factors such as illumination change, camera hardware, the movement of the picking robots, and image background interference have a great impact on t...
-
Article
Diff-Font: Diffusion Model for Robust One-Shot Font Generation
Font generation presents a significant challenge due to the intricate details needed, especially for languages with complex ideograms and numerous characters, such as Chinese and Korean. Although various few-s...
-
Article
Grounded Affordance from Exocentric View
Affordance grounding aims to locate objects’ “action possibilities” regions, an essential step toward embodied intelligence. Due to the diversity of interactive affordance, i.e., the uniqueness of different indiv...
-
Article
Shuff-BiseNet: a dual-branch segmentation network for pavement cracks
In order to accurately obtain the shape and size of pavement cracks, analyze the severity of pavement cracks, avoid deterioration of the situation, and take timely measures, we proposed a dual-branch structure...
-
Article
Delving into Identify-Emphasize Paradigm for Combating Unknown Bias
Dataset biases are notoriously detrimental to model robustness and generalization. The identify-emphasize paradigm appears to be effective in dealing with unknown biases. However, we discover that it is still ...
-
Article
Towards Defending Multiple \(\ell_p\)-Norm Bounded Adversarial Perturbations via Gated Batch Normalization
There has been extensive evidence demonstrating that deep neural networks are vulnerable to adversarial examples, which motivates the development of defenses against adversarial attacks. Existing adversarial d...
-
Article
Research on image caption generation method based on multi-modal pre-training model and text mixup optimization
In recent years, multi-modal pre-training models have demonstrated remarkable cross-modal representation capabilities, catalyzing the rapid evolution of multi-modal downstream tasks, particularly in image capt...
-
Article
GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions
Image restoration in adverse weather conditions is a difficult task in computer vision. In this paper, we propose a novel transformer-based framework called GridFormer which serves as a backbone for image rest...
-
Article
YOLO-MTG: a lightweight YOLO model for multi-target garbage detection
With wide adoption of deep learning technology in AI, intelligent garbage detection has become a hot research topic. However, existing datasets currently used for garbage detection rarely involve multi-catego...
-
Article
Convex–Concave Tensor Robust Principal Component Analysis
Tensor robust principal component analysis (TRPCA) aims at recovering the underlying low-rank clean tensor and residual sparse component from the observed tensor. The recovery quality heavily depends on the de...
-
Article
Diagram Perception Networks for Textbook Question Answering via Joint Optimization
Textbook question answering requires a system to answer questions with or without diagrams accurately, given multimodal contexts that include rich paragraphs and diagrams. Existing methods usually utilize a pi...
-
Article
Robust Unpaired Image Dehazing via Density and Depth Decomposition
To overcome the overfitting issue of dehazing models trained on synthetic hazy-clean image pairs, recent methods attempt to boost the generalization ability by training on unpaired data. However, most of exist...
-
Article
An effective masked transformer network for image denoising
The rising popularity of employing deep learning networks for image denoising can be observed over the past decade. Typically, their exceptional performance is rooted in their ability to learn the mapping from
-
Article
VNAS: Variational Neural Architecture Search
Differentiable neural architecture search delivers point estimation to the optimal architecture, which yields arbitrarily high confidence to the learned architecture. This approach thus suffers in calibration ...
-
Article
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary algorithm (EA) and derives that both have consistent mathematical ...