Search Results

Showing 61-80 of 10,000 results
  1. MoPE: Mixture of Pooling Experts Framework for Image-Text Retrieval

    Image-text retrieval is a fundamental and crucial task in the field of multimodal interaction, which assists internet users in retrieving the...
    Jiangfeng Li, Bowen Wang, ... Qinpei Zhao in MultiMedia Modeling
    Conference paper 2024
  2. Differentiable Neural Architecture Search Based on Efficient Architecture for Lightweight Image Super-Resolution

    With the advancement of deep neural networks, image Super-Resolution (SR) has witnessed remarkable improvements in performance. However, the...
    Chunyin Sheng, Xiang Gao, ... Fan Wang in MultiMedia Modeling
    Conference paper 2024
  3. A Language-Based Solution to Enable Metaverse Retrieval

    Recently, the Metaverse is becoming increasingly attractive, with millions of users accessing the many available virtual worlds. However, how do...
    Ali Abdari, Alex Falcon, Giuseppe Serra in MultiMedia Modeling
    Conference paper 2024
  4. Sustainable Commercial Fishery Control Using Multimedia Forensics Data from Non-trusted, Mobile Edge Nodes

    Uncontrolled over-fishing has been exemplified by the UN as a serious ecological challenge and a major threat to sustainable food supplies. Emerging...
    Aril Bernhard Ovesen, Tor-Arne Schmidt Nordmo, ... Dag Johansen in MultiMedia Modeling
    Conference paper 2024
  5. A Region Based Non-overlapping Reference Speech Estimation Method for Speaker Extraction

    Speaker extraction is a technique that separates the target speech from multi-talker mixtures using a priori information about the target speaker,...
    Yiru Zhang, Zeke Li, ... Qun Yang in MultiMedia Modeling
    Conference paper 2024
  6. Multi-modal Video Topic Segmentation with Dual-Contrastive Domain Adaptation

    Video topic segmentation unveils the coarse-grained semantic structure underlying videos and is essential for other video understanding tasks. Given...
    Linzi Xing, Quan Tran, ... Giuseppe Carenini in MultiMedia Modeling
    Conference paper 2024
  7. Audio-Visual Segmentation by Leveraging Multi-scaled Features Learning

    Audio-visual segmentation with semantics (AVSS) is an advanced approach that enriches Audio-visual segmentation (AVS) by incorporating object...
    Sze An Peter Tan, Guangyu Gao, Jia Zhao in MultiMedia Modeling
    Conference paper 2024
  8. SM-GAN: Single-Stage and Multi-object Text Guided Image Editing

    In recent years, text-guided scene image manipulation has received extensive attention in the computer vision community. Most of the existing...
    Ruichen Li, Lei Wu, ... Minggang He in MultiMedia Modeling
    Conference paper 2024
  9. A Secure and Fair Federated Learning Protocol Under the Universal Composability Framework

    Federated learning is a paradigm of distributed machine learning that enables multiple participants to collaboratively train a global model while...
    Li Qiuxian, Zhou Quanxing, Ding Hongfa in MultiMedia Modeling
    Conference paper 2024
  10. NearbyPatchCL: Leveraging Nearby Patches for Self-supervised Patch-Level Multi-class Classification in Whole-Slide Images

    Whole-slide image (WSI) analysis plays a crucial role in cancer diagnosis and treatment. In addressing the demands of this critical task,...
    Gia-Bao Le, Van-Tien Nguyen, ... Minh-Triet Tran in MultiMedia Modeling
    Conference paper 2024
  11. Joint Image Data Hiding and Rate-Distortion Optimization in Neural Compressed Latent Representations

    We present an end-to-end learned image data hiding framework that embeds and extracts secrets in the latent representations of a neural compressor....
    Chen-Hsiu Huang, Ja-Ling Wu in MultiMedia Modeling
    Conference paper 2024
  12. Hierarchical Supervised Contrastive Learning for Multimodal Sentiment Analysis

    Multimodal sentiment analysis (MSA) is dedicated to deciphering human emotions in videos. It is a challenging task due to the semantic disparities...
    Kezhou Chen, Shuo Wang, Yanbin Hao in MultiMedia Modeling
    Conference paper 2024
  13. Semantic Importance-Based Deep Image Compression Using a Generative Approach

    Semantic image compression can greatly reduce the amount of transmitted data by representing and reconstructing images using semantic information....
    ** Gu, Yuanyuan Xu, Kun Zhu in MultiMedia Modeling
    Conference paper 2024
  14. SEAS-Net: Segment Exchange Augmentation for Semi-supervised Brain Tumor Segmentation

    Accurate segmentation of brain tumors is crucial for cancer diagnosis, treatment planning, and evaluation. However, semi-supervised brain tumor image...
    Jing Zhang, Wei Wu in MultiMedia Modeling
    Conference paper 2024
  15. MRHF: Multi-stage Retrieval and Hierarchical Fusion for Textbook Question Answering

    Textbook question answering is challenging as it aims to automatically answer various questions on textbook lessons with long text and complex...
    Peide Zhu, Zhen Wang, ... Jie Yang in MultiMedia Modeling
    Conference paper 2024
  16. Multi-task Collaborative Network for Image-Text Retrieval

    Image-text retrieval aims to capture semantic relevance between images and texts. Most existing approaches rely solely on the image-text pairs to...
    Xueyang Qin, Lishuang Li, ... Guangyao Pang in MultiMedia Modeling
    Conference paper 2024
  17. LigCDnet: Remote Sensing Image Cloud Detection Based on Lightweight Framework

    Cloud contamination is inevitable in remote sensing images, resulting in a large number of images that cannot be applied in various fields....
    Baotong Su, Wenguang Zheng in MultiMedia Modeling
    Conference paper 2024
  18. Unsupervised Multi-collaborative Learning Network for 3D Face Reconstruction

    Monocular image-based 3D fine face reconstruction techniques aim to reconstruct 3D faces with rich face details from a single image. Existing methods...
    Wenlong Lu, Suping Wu, ... Shengjia Zhang in MultiMedia Modeling
    Conference paper 2024
  19. FGENet: Fine-Grained Extraction Network for Congested Crowd Counting

    Crowd counting has gained significant popularity due to its practical applications. However, mainstream counting methods ignore precise individual...
    Hao-Yuan Ma, Li Zhang, Xiang-Yi Wei in MultiMedia Modeling
    Conference paper 2024
  20. Pseudo-label Based Unsupervised Momentum Representation Learning for Multi-domain Image Retrieval

    Although much current cross-domain image retrieval research has made good progress, most of the work is targeted at specific domains. At the same...
    Mingyuan Ge, Jianan Shui, ... Mingyong Li in MultiMedia Modeling
    Conference paper 2024