63,265 Result(s)
-
Chapter and Conference Paper
Abstracts Embeddings Evaluation: A Case Study of Artificial Intelligence and Medical Imaging for the COVID-19 Infection
During the COVID-19 pandemic, a huge amount of literature was produced covering different aspects of infection. The use of artificial intelligence (AI) in medical imaging has been shown to improve screening, d...
-
Chapter and Conference Paper
Video-Based Emotion Estimation Using Deep Neural Networks: A Comparative Study
In this study we investigate the effectiveness of deep neural networks in predicting valence and arousal solely from visual information of video sequences. Several recent Convolutional Neural Network (CNN) and...
-
Chapter and Conference Paper
On-Device Learning with Binary Neural Networks
Existing Continual Learning (CL) solutions only partially address the constraints on power, memory and computation of the deep learning models when deployed on low-power embedded CPUs. In this paper, we propos...
-
Chapter and Conference Paper
Multi-level Patch Transformer for Style Transfer with Single Reference Image
Despite the recent success of image style transfer with Generative Adversarial Networks (GANs), this task remains challenging due to the requirements of large volumes of style image data. In this work, we pres...
-
Chapter and Conference Paper
Correlation Analysis Between Insomnia Severity and Depressive Symptoms of College Students Based on Pseudo-Siamese Network
To explore the correlation between emotional mood and sleep quality in a college student population, we propose a new method based on a pseudo-siamese network, which can quickly diagnose the causes of depression...
-
Chapter and Conference Paper
Ookpik- A Collection of Out-of-Context Image-Caption Pairs
The development of AI-based Cheapfakes detection models has been hindered by a significant challenge - the scarcity of real-world datasets. Our work directly tackles this issue by focusing on out-of-context (O...
-
Chapter and Conference Paper
Streaming Graph-Based Supervoxel Computation Based on Dynamic Iterative Spanning Forest
Streaming video segmentation decreases processing time by creating supervoxels taking into account small parts of the video instead of using all video content. Thanks to the good performance of the Iterative S...
-
Chapter and Conference Paper
Improving Small License Plate Detection with Bidirectional Vehicle-Plate Relation
License plate detection is a critical component of license plate recognition systems. A challenge in this domain is detecting small license plates captured at a considerable distance. Previous researchers have...
-
Chapter and Conference Paper
Lightweight Image Captioning Model Based on Knowledge Distillation
The performance of image captioning models based on deep learning has been significantly improved compared with traditional algorithms. However, due to the complex network structure and huge parameters, these ...
-
Chapter and Conference Paper
Deformable CNN with Position Encoding for Arbitrary-Scale Super-Resolution
Implicit neural representation (INR) has been widely used to learn continuous representation of images, as it enables arbitrary-scale super-resolution (SR). However, most existing INR-based arbitrary-scale SR ...
-
Chapter and Conference Paper
Analysis and Impact of Training Set Size in Cross-Subject Human Activity Recognition
The ubiquity of consumer devices with sensing and computational capabilities, such as smartphones and smartwatches, has increased interest in their use in human activity recognition for healthcare monitoring a...
-
Chapter and Conference Paper
Self-supervised Monocular Depth Estimation on Unseen Synthetic Cameras
Monocular depth estimation is a critical task in computer vision, and self-supervised deep learning methods have achieved remarkable results in recent years. However, these models often struggle on camera gene...
-
Chapter and Conference Paper
A Detail-Guided Multi-source Fusion Network for Remote Sensing Object Detection
Optical and synthetic aperture radar (SAR) remote sensing have established themselves as valuable tools for object detection. Optical images exhibit weather-dependence but offer intricate information, whereas ...
-
Chapter and Conference Paper
Multi-task Collaborative Network for Image-Text Retrieval
Image-text retrieval aims to capture semantic relevance between images and texts. Most existing approaches rely solely on the image-text pairs to learn visual-semantic representation through fine-grained align...
-
Chapter and Conference Paper
Semantic Importance-Based Deep Image Compression Using a Generative Approach
Semantic image compression can greatly reduce the amount of transmitted data by representing and reconstructing images using semantic information. Considering the fact that objects in an image are not equally ...
-
Chapter and Conference Paper
MoPE: Mixture of Pooling Experts Framework for Image-Text Retrieval
Image-text retrieval is a fundamental and crucial task in the field of multimodal interaction, which assists internet users in retrieving the required visual and textual information conveniently. The dominant ...
-
Chapter and Conference Paper
A Systematic Review of Multimodal Deep Learning Approaches for COVID-19 Diagnosis
During and after the years of the COVID-19 pandemic, researchers and domain experts put all their effort into the discovery of accurate and reliable techniques for the detection and diagnosis of this disease i...
-
Chapter and Conference Paper
Self-supervised Edge Structure Learning for Multi-view Stereo and Parallel Optimization
Recent studies have witnessed that many self-supervised methods obtain clear progress on the multi-view stereo (MVS). However, existing methods ignore the edge structure information of the reconstructed target...
-
Chapter and Conference Paper
Lightweight Rolling Shutter Image Restoration Network Based on Undistorted Flow
Rolling shutter (RS) cameras are widely used in fields such as drone photography and robot navigation. However, when shooting a fast-moving target, the captured image may be distorted and blurred due to the fea...
-
Chapter and Conference Paper
Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites
Large language models (LLMs) have shown remarkable performance in natural language processing (NLP) tasks. To comprehend and execute diverse human instructions over image data, instruction-tuned large vision-l...