Search Page | SpringerLink

Align vision-language semantics by multi-task learning for multi-modal summarization

Most current multi-modal summarization methods follow a cascaded manner, where an off-the-shelf object detector is first used to extract visual...

Chenhao Cui, Xinnian Liang, ... Zhoujun Li in Neural Computing and Applications

Article 17 May 2024

MutualFormer: Multi-modal Representation Learning via Cross-Diffusion Attention

Aggregating multi-modal data to obtain reliable data representation attracts more and more attention. Recent studies demonstrate that Transformer...

Xixi Wang, Xiao Wang, ... Bin Luo in International Journal of Computer Vision

Article 24 April 2024

Modelling flight trajectories with multi-modal generative adversarial imitation learning

Models of aircraft trajectories become important components of systems supporting the trajectory based operations paradigm: trajectory predictability...

Christos Spatharis, Konstantinos Blekas, George A. Vouros in Applied Intelligence

Article 03 June 2024

Multi-modal Hash Learning: Efficient Multimedia Retrieval and Recommendations

This book systemically presents key concepts of multi-modal hashing technology, recent advances on large-scale efficient multimedia search and...

Lei Zhu, Jingjing Li, Weili Guan in Synthesis Lectures on Information Concepts, Retrieval, and Services

Book 2024

PointCMC: cross-modal multi-scale correspondences learning for point cloud understanding

Existing cross-modal frameworks have achieved impressive performance in point cloud object representations learning, where a 2D image encoder is...

Honggu Zhou, Xiaogang Peng, ... Zizhao Wu in Multimedia Systems

Article 30 April 2024

UniMod1K: Towards a More Universal Large-Scale Dataset and Benchmark for Multi-modal Learning

The emergence of large-scale high-quality datasets has stimulated the rapid development of deep learning in recent years. However, most computer...

Xue-Feng Zhu, Tianyang Xu, ... Josef Kittler in International Journal of Computer Vision

Article 22 February 2024

UaMC: user-augmented conversation recommendation via multi-modal graph learning and context mining

Conversation Recommender System (CRS) engage in multi-turn conversations with users and provide recommendations through responses. As user...

Siqi Fan, Yequan Wang, ... Shuo Shang in World Wide Web

Article 01 November 2023

MMPL-Net: multi-modal prototype learning for one-shot RGB-D segmentation

For one-shot segmentation, prototype learning is extensively used. However, using only one RGB prototype to represent all information in the support...

Dexing Shan, Yunzhou Zhang, ... Dermot Kerr in Neural Computing and Applications

Article 28 February 2023

Federated learning inspired privacy sensitive emotion recognition based on multi-modal physiological sensors

Traditional machine learning classifiers can automatically evaluate human behaviour and emotion recognition tasks. However, prior research work does...

Neha Gahlan, Divyashikha Sethia in Cluster Computing

Article 24 September 2023

Bayesian mixture variational autoencoders for multi-modal learning

This paper provides an in-depth analysis on how to effectively acquire and generalize cross-modal knowledge for multi-modal learning....

Keng-Te Liao, Bo-Wei Huang, ... Shou-De Lin in Machine Learning

Article 07 November 2022

Human Gait Recognition Based on Frontal-View Walking Sequences Using Multi-modal Feature Representations and Learning

Despite that much progress has been reported in gait recognition, most of these existing works adopt lateral-view parameters as gait features, which...

Muqing Deng, Zebang Zhong, ... Junrong Liao in Neural Processing Letters

Article Open access 02 April 2024

Collaboration based multi-modal multi-label learning

Complex objects can be represented as multiple modal features and associated with multiple labels. The major challenge of complex object...

Yi Zhang, Yinlong Zhu, ... Chongjung Wang in Applied Intelligence

Article 04 March 2022

Deep learning based object detection from multi-modal sensors: an overview

Object detection is an important problem and has a wide range of applications. In recent years, deep learning based object detection with...

Ye Liu, Shiyang Meng, ... Jun Liu in Multimedia Tools and Applications

Article 28 July 2023

CMC-MMR: multi-modal recommendation model with cross-modal correction

Multi-modal recommendation using multi-modal features (e.g., image and text features) has received significant attention and has been shown to have...

YuBin Wang, HongBin Xia, Yuan Liu in Journal of Intelligent Information Systems

Article 20 February 2024

SSDMM-VAE: variational multi-modal disentangled representation learning

Multi-modal learning aims at simultaneously modelling data from several modalities such as image, text and speech. The goal is to simultaneously...

Arnab Kumar Mondal, Ajay Sailopal, ... Prathosh AP in Applied Intelligence

Article 21 July 2022

Lightweight Multi-modal Representation Learning for RGB Salient Object Detection

The task of salient object detection (SOD) often faces various challenges such as complex backgrounds and low appearance contrast. Depth information,...

Yun Xiao, Yameng Huang, ... Jin Tang in Cognitive Computation

Article 02 June 2023

MADMM: microservice system anomaly detection via multi-modal data and multi-feature extraction

Accurately detecting anomalies in microservice systems is crucial to avoid system failures and economic losses for users. Existing approaches detect...

Peipeng Wang, Xiuguo Zhang, ... Zihan Chen in Neural Computing and Applications

Article 18 May 2024

Re-transfer learning and multi-modal learning assisted early diagnosis of Alzheimer’s disease

Nowadays more and more elderly people are suffering from Alzheimer’s disease (AD). Finely recognizing mild cognitive impairment (MCI) in early stage...

Meie Fang, Zhuxin Jin, ... Zhigeng Pan in Multimedia Tools and Applications

Article 02 April 2022