  1. No Access

    Article

    Progressive spatial–temporal transfer model for unsupervised person re-identification

    Over the past decade, a more widespread area of computer vision research has been person re-identification (P-Reid). This technology is applied in fields such as pedestrian tracking, security, and video survei...

    Shuren Zhou, Zhixiong Li, Jie Liu in International Journal of Multimedia Inform… (2024)

  2. No Access

    Article

    Parameter-efficient tuning of cross-modal retrieval for a specific database via trainable textual and visual prompts

    A novel cross-modal image retrieval method realized by parameter efficiently tuning a pre-trained cross-modal model is proposed in this study. Conventional cross-modal retrieval methods realize text-to-image r...

    Huaying Zhang, Rintaro Yanagi, Ren Togo in International Journal of Multimedia Inform… (2024)

  3. No Access

    Article

    PSNet: position-shift alignment network for image caption

    Recently, Transformer-based models have gained increasing popularity in the field of image captioning. The global attention mechanism of the Transformer facilitates the integration of region and grid features,...

    Lixia Xue, Awen Zhang, Ronggui Wang in International Journal of Multimedia Inform… (2023)

  4. No Access

    Article

    A lightweight small object detection algorithm based on improved YOLOv5 for driving scenarios

    Small object detection has been a longstanding challenge in the field of object detection, and achieving high detection accuracy is crucial for autonomous driving, especially for small objects. This article fo...

    Zonghui Wen, Jia Su, Yongxiang Zhang in International Journal of Multimedia Inform… (2023)

  5. No Access

    Article

    CoCoOpter: Pre-train, prompt, and fine-tune the vision-language model for few-shot image classification

    Few-shot image classification aims at learning to generalize to unseen new categories from a few training samples. Transfer learning is one prominent approach to the task, which first learns a backbone from th...

    Jie Yan, Yuxiang Xie, Yanming Guo in International Journal of Multimedia Inform… (2023)

  6. No Access

    Article

    Style-aware adversarial pairwise ranking for image recommendation systems

    The vulnerability of Machine Learning (ML) models to adversarial attack and their prominence pose security issues, notably in image recommendation systems. The adversarial training method is an excellent strat...

    Zhefu Wu, Song Zhang, Agyemang Paul in International Journal of Multimedia Inform… (2023)

  7. No Access

    Chapter and Conference Paper

    CMC_v2: Towards More Accurate COVID-19 Detection with Discriminative Video Priors

    This paper presents our solution for the 2nd COVID-19 Competition, occurring in the framework of the AIMIA Workshop at the European Conference on Computer Vision (ECCV 2022). In our approach, we employ the win...

    Junlin Hou, Jilan Xu, Nan Zhang, Yi Wang in Computer Vision – ECCV 2022 Workshops (2023)

  8. No Access

    Chapter and Conference Paper

    Boosting COVID-19 Severity Detection with Infection-Aware Contrastive Mixup Classification

    This paper presents our solution for the 2nd COVID-19 Severity Detection Competition. This task aims to distinguish the Mild, Moderate, Severe, and Critical grades in COVID-19 chest CT images. In our approach,...

    Junlin Hou, Jilan Xu, Nan Zhang, Yuejie Zhang in Computer Vision – ECCV 2022 Workshops (2023)

  9. No Access

    Chapter and Conference Paper

    BiTAT: Neural Network Binarization with Task-Dependent Aggregated Transformation

    Neural network quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation, while preserving ...

    Geon Park, Jaehong Yoon, Haiyang Zhang, Xing Zhang in Computer Vision – ECCV 2022 Workshops (2023)

  10. No Access

    Chapter and Conference Paper

    BadDet: Backdoor Attacks on Object Detection

    Backdoor attack is a severe security threat which injects a backdoor trigger into a small portion of training data such that the trained model gives incorrect predictions when the specific trigger appears. Whi...

    Shih-Han Chan, Yinpeng Dong, Jun Zhu, Xiaolu Zhang in Computer Vision – ECCV 2022 Workshops (2023)

  11. No Access

    Chapter and Conference Paper

    Hydra Attention: Efficient Attention with Many Heads

    While transformers have begun to dominate many tasks in vision, applying them to large images is still computationally difficult. A large reason for this is that self-attention scales quadratically with the nu...

    Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai in Computer Vision – ECCV 2022 Workshops (2023)

  12. No Access

    Chapter and Conference Paper

    An Improved Lightweight Network Based on YOLOv5s for Object Detection in Autonomous Driving

    Object detection with high accuracy and fast inference speed based on camera sensors is important for autonomous driving. This paper develops a lightweight object detection network based on YOLOv5s which is on...

    Guofa Li, Yingjie Zhang, Delin Ouyang, Xingda Qu in Computer Vision – ECCV 2022 Workshops (2023)

  13. No Access

    Chapter and Conference Paper

    RPR-Net: A Point Cloud-Based Rotation-Aware Large Scale Place Recognition Network

    Point cloud-based large scale place recognition is an important but challenging task for many applications such as Simultaneous Localization and Mapping (SLAM). Taking the task as a point cloud retrieval probl...

    Zhaoxin Fan, Zhenbo Song, Wenping Zhang in Computer Vision – ECCV 2022 Workshops (2023)

  14. No Access

    Article

    Prototype local–global alignment network for image–text retrieval

    Image–text retrieval is a challenging task due to the requirement of thorough multimodal understanding and precise inter-modality relationship discovery. However, most previous approaches resort to doing globa...

    Lingtao Meng, Feifei Zhang, Xi Zhang in International Journal of Multimedia Inform… (2022)

  15. No Access

    Article

    FCT: fusing CNN and transformer for scene classification

    Scene classification based on convolutional neural networks (CNNs) has achieved great success in recent years. In CNNs, the convolution operation performs well in extracting local features, but its ability to ...

    Yuxiang Xie, Jie Yan, Lai Kang, Yanming Guo in International Journal of Multimedia Inform… (2022)

  16. No Access

    Article

    Your heart rate betrays you: multimodal learning with spatio-temporal fusion networks for micro-expression recognition

    Micro-expressions can convey feelings that people are trying to hide. At present, some studies on micro-expression, most of which only use the temporal or spatial information in the image to recognize micro-ex...

    Ren Zhang, Ning He, Shengjie Liu, Ying Wu in International Journal of Multimedia Inform… (2022)

  17. No Access

    Article

    TCKGE: Transformers with contrastive learning for knowledge graph embedding

    Representation learning of knowledge graphs has emerged as a powerful technique for various downstream tasks. In recent years, numerous research efforts have been made for knowledge graphs embedding. However, ...

    Xiaowei Zhang, Quan Fang, Jun Hu in International Journal of Multimedia Inform… (2022)

  18. No Access

    Article

    Multi-aware coreference relation network for visual dialog

    As a challenging cross-media task, visual dialog assesses whether an AI agent can converse in human language based on its understanding of visual content. So the critical issue is to pay attention not only to ...

    Zefan Zhang, Tianling Jiang, Chunping Liu in International Journal of Multimedia Inform… (2022)

  19. No Access

    Chapter and Conference Paper

    Chair Design of Waiting Space in Maternity Department Based on QFD-Kano and FBS

    In order to design the seats in the waiting space of the obstetrics and gynecology department of the hospital, reduce the anxiety of pregnant women waiting for the inspection process, objectively and rationall...

    Lin Zhang, Zhang Zhang in HCI International 2022 Posters (2022)

  20. No Access

    Chapter and Conference Paper

    TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning

    This paper presents a transformer framework for few-shot learning, termed TransVLAD, with one focus showing the power of locally aggregated descriptors for few-shot learning. Our TransVLAD model is simple: a s...

    Haoquan Li, Laoming Zhang, Daoan Zhang, Lang Fu, Peng Yang in Computer Vision – ECCV 2022 (2022)
