Search Results
-
Vision transformer models for mobile/edge devices: a survey
With the rapidly growing demand for high-performance deep learning vision models on mobile and edge devices, this paper emphasizes the importance of...
-
Masked Vision-language Transformer in Fashion
We present a masked vision-language transformer (MVLT) for fashion-specific multi-modal representation. Technically, we simply utilize the vision...
-
Optimized vision transformer encoder with cnn for automatic psoriasis disease detection
Psoriasis is a skin disorder that results in swollen skin cells and red, itchy areas on the skin. 40% of the world's population is currently affected...
-
Training Object Detectors from Scratch: An Empirical Study in the Era of Vision Transformer
Modeling in computer vision has long been dominated by convolutional neural networks (CNNs). Recently, in light of the excellent performance of...
-
Vision transformer with multiple granularities for person re-identification
Extracting discriminative features using vision transformer is a popular research direction for person re-identification. However, feature extraction...
-
A survey of the vision transformers and their CNN-transformer based variants
Vision transformers have become popular as a possible substitute to convolutional neural networks (CNNs) for a variety of computer vision...
-
Underwater image enhancement using lightweight vision transformer
Deep learning-based models have recently shown a strong potential in Underwater Image Enhancement (UIE) that are satisfying and have the right colors...
-
Self-supervised approach for diabetic retinopathy severity detection using vision transformer
Diabetic retinopathy (DR) is a diabetic condition that affects vision, despite the great success of supervised learning and Conventional Neural...
-
Super Vision Transformer
We attempt to reduce the computational costs in vision transformers (ViTs), which increase quadratically in the token number. We present a novel...
-
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary...
-
Exploring vision transformer: classifying electron-microscopy pollen images with transformer
Pollen identification is a sub-discipline of Palynology, which has broad applications in several fields such as allergy control, paleoclimate...
-
Ctnet: rethinking convolutional neural networks and vision transformer for medical image segmentation
Convolutional architectures have demonstrated remarkable success in various vision tasks, offering efficient learning through their inherent...
-
Strawberry disease identification with vision transformer-based models
Strawberry is a healthy, beneficial fruit and one of the most valuable exports for most countries. However, diseases could produce poor-quality...
-
Efficient deepfake detection using shallow vision transformer
Deepfake is a deep learning-based technique that generates fake face images by mimicking the distribution of original images. Deepfake images can be...
-
ViT-PGC: vision transformer for pedestrian gender classification on small-size dataset
Pedestrian gender classification (PGC) is a key task in full-body-based pedestrian image analysis and has become an important area in applications...
-
GFPE-ViT: vision transformer with geometric-fractal-based position encoding
In recent years, transformers have become a significant tool in computer vision, revolutionizing fundamental tasks. This paper focuses on the mapping...
-
Vision transformer and its variants for image classification in digital breast cancer histopathology: a comparative study
Convolutional Neural Networks (CNNs) have been the most popular image classification tool for a long time. Inspired by the greater success of the...
-
Image recoloring for color vision deficiency compensation using Swin transformer
People with color vision deficiency (CVD) have difficulty in distinguishing differences between colors. To compensate for the loss of color contrast...
-
HELViT: highly efficient lightweight vision transformer for remote sensing image scene classification
Remote sensing image scene classification methods based on convolutional neural networks (CNN) have been extremely successful. However, the...
-
Static hand gesture recognition method based on the Vision Transformer
Hand gesture recognition (HGR) is the most important part of human-computer interaction (HCI). Static hand gesture recognition is equivalent to the...