Search Results

Showing 1-20 of 3,805 results
  1. Enhancing Semantics-Driven Recommender Systems with Visual Features

    Content-based semantics-driven recommender systems are often used in the small-scale news recommendation domain, founded on the TF-IDF measure but...
    Mounir M. Bendouch, Flavius Frasincar, Tarmo Robal in Advanced Information Systems Engineering
    Conference paper 2022
  2. FaSRnet: a feature and semantics refinement network for human pose estimation

    Due to factors such as motion blur, video out-of-focus, and occlusion, multi-frame human pose estimation is a challenging task. Exploiting temporal...

    Yuanhong Zhong, Qianfeng Xu, ... Shanshan Wang in Frontiers of Information Technology & Electronic Engineering
    Article 01 April 2024
  3. Enhancing Visual Question Answering with Generated Image Caption

    Visual Question Answering (VQA) poses a formidable challenge, necessitating computer systems to proficiently execute essential computer vision tasks,...
    Kieu-Anh Thi Truong, Truong-Thuy Tran, ... Duc-Trong Le in Computational Data and Social Networks
    Conference paper 2024
  4. A Cross-Modal View to Utilize Label Semantics for Enhancing Student Network in Multi-label Classification

    Knowledge transfer has become a promising approach for improving the performance and efficiency of relatively lightweight networks. Previous research...
    Yuzhuo Qin, Hengwei Liu, Xiaodong Gu in Artificial Neural Networks and Machine Learning – ICANN 2023
    Conference paper 2023
  5. Visual and language semantic hybrid enhancement and complementary for video description

    It is a fundamental task of computer vision to describe and express the visual content of a video in natural language, which not only highly...

    Pengjie Tang, Yunlan Tan, Wenlang Luo in Neural Computing and Applications
    Article 20 January 2022
  6. Zero-shot image classification via Visual–Semantic Feature Decoupling

    Zero-shot image classification refers to the use of labeled images to train a classification model that can correctly classify images of unseen...

    Xin Sun, Yu Tian, Haojie Li in Multimedia Systems
    Article 15 March 2024
  7. Audio-Visual Segmentation by Leveraging Multi-scaled Features Learning

    Audio-visual segmentation with semantics (AVSS) is an advanced approach that enriches Audio-visual segmentation (AVS) by incorporating object...
    Sze An Peter Tan, Guangyu Gao, Jia Zhao in MultiMedia Modeling
    Conference paper 2024
  8. Enhancing Fairness of Visual Attribute Predictors

    The performance of deep neural networks for image recognition tasks such as predicting a smiling face is known to degrade with under-represented...
    Tobias Hänel, Nishant Kumar, ... Stefan Gumhold in Computer Vision – ACCV 2022
    Conference paper 2023
  9. Mutually guided learning of global semantics and local representations for image restoration

    The global semantics and the local scene representation are crucial for image restoration. Although existing methods have proposed various hybrid...

    Yuanshuo Cheng, Mingwen Shao, Yecong Wan in Multimedia Tools and Applications
    Article 14 September 2023
  10. Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking

    Masked Autoencoders (MAE) have been popular paradigms for large-scale vision representation pre-training. However, MAE solely reconstructs the...

    Peng Gao, Ziyi Lin, ... Yu Qiao in International Journal of Computer Vision
    Article 24 November 2023
  11. Multi-granularity hypergraph-guided transformer learning framework for visual classification

    Fine-grained single-label classification tasks aim to distinguish highly similar categories but often overlook inter-category relationships....

    Jianjian Jiang, Ziwei Chen, ... Xiaochen Yuan in The Visual Computer
    Article 28 June 2024
  12. End-to-End Image Compression Through Machine Semantics

    With the increasing demand for AI automated analysis, machine semantics have replaced signals as a new focus in visual information compression. In...
    Jianran Liu, Chang Zhang, Wen Ji in Digital Multimedia Communications
    Conference paper 2024
  13. Lgvc: language-guided visual context modeling for 3D visual grounding

    3D visual grounding is crucial for understanding cross-modal scenes, linking visual objects to their corresponding language descriptions. Traditional...

    Liang Geng, Jianqin Yin, Yingchun Niu in Neural Computing and Applications
    Article 23 April 2024
  14. SLOD2+WIN: semantics-aware addition and LoD of 3D window details for LoD2 CityGML models with textures

    In many urban planning and visualization applications, it is crucial to have 3D window details. However, the process of acquiring and reconstructing...

    Xingzi Zhang, Kan Chen, ... Marius Erdt in The Visual Computer
    Article 15 March 2024
  15. Contrastive learning for unsupervised sentence embeddings using negative samples with diminished semantics

    Unsupervised learning has made significant progress in recent years, driven by advancements in contrastive learning. However, current methods for...

    Zhiyi Yu, Hong Li, Jialin Feng in The Journal of Supercomputing
    Article 27 September 2023
  16. GViG: Generative Visual Grounding Using Prompt-Based Language Modeling for Visual Question Answering

    The WSDM 2023 Toloka VQA challenge introduces a new Grounding-based Visual Question Answering (GVQA) dataset, elevating multimodal task complexity....
    Yi-Ting Li, Ying-Jia Lin, ... Hung-Yu Kao in Advances in Knowledge Discovery and Data Mining
    Conference paper 2024
  17. MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval

    Dominant pre-training work for video-text retrieval mainly adopt the “dual-encoder” architectures to enable efficient retrieval, where two separate...
    Yuying Ge, Yixiao Ge, ... Ping Luo in Computer Vision – ECCV 2022
    Conference paper 2022
  18. Unimodal-Multimodal Collaborative Enhancement for Audio-Visual Event Localization

    Audio-visual event localization (AVE) task focuses on localizing audio-visual events where event signals occur in both audio and visual modalities....
    Huilin Tian, Jingke Meng, ... Weishi Zheng in Pattern Recognition and Computer Vision
    Conference paper 2024
  19. An Effective Pre-trained Visual Encoder for Medical Visual Question Answering

    Medical Visual Question Answering (Med-VQA) is a domain-specific task that answers a given clinical question regarding a radiology image. It requires...
    Yefan Huang, Xiaoli Wang, Jinsong Su in Advanced Data Mining and Applications
    Conference paper 2023
  20. Indirect visual–semantic alignment for generalized zero-shot recognition

    Our paper addresses the challenge of generalized zero-shot learning, where the label of a target image may belong to either a seen or an unseen...

    Yan-He Chen, Mei-Chen Yeh in Multimedia Systems
    Article 03 April 2024