  1. No Access

    Article

    Hugs Bring Double Benefits: Unsupervised Cross-Modal Hashing with Multi-granularity Aligned Transformers

    Unsupervised cross-modal hashing (UCMH) has been commonly explored to support large-scale cross-modal retrieval of unlabeled data. Despite promising progress, most existing approaches are developed on convolut...

    Jinpeng Wang, Ziyun Zeng, Bin Chen, Yuting Wang in International Journal of Computer Vision (2024)

  2. No Access

    Article

    Softmax-Free Linear Transformers

    Vision transformers (ViTs) have pushed the state-of-the-art for visual perception tasks. The self-attention mechanism underpinning the strength of ViTs has a quadratic complexity in both computation and memory...

    Jiachen Lu, Junge Zhang, Xiatian Zhu in International Journal of Computer Vision (2024)

  3. No Access

    Article

    Does Confusion Really Hurt Novel Class Discovery?

    When sampling data of specific classes (i.e., known classes) for a scientific task, collectors may encounter unknown classes (i.e., novel classes). Since these novel classes might be valuable for future research,...

    Haoang Chi, Wenjing Yang, Feng Liu, Long Lan in International Journal of Computer Vision (2024)

  4. No Access

    Article

    A full-detection association tracker with confidence optimization for real-time multi-object tracking

    Multi-object tracking (MOT) aims to obtain trajectories with unique identifiers for multiple objects in a video stream. In current approaches, confidence thresholds were frequently used to perform multi-stage ...

    Youyu Liu, Xiangxiang Zhou, Zhendong Zhang, Yi Li in Journal of Real-Time Image Processing (2024)

  5. No Access

    Article

    MFMANet: a multispectral pedestrian detection network using multi-resolution RGB feature reuse with multi-scale FIR attentions

    In the realm of multispectral pedestrian detection, especially under challenging low-illumination, the existing methods, characterized by cross-modality feature interaction, lack generalization and are hard to...

    Jiaren Guo, Yuzhen Zhang, Jianyin Zheng, Zihao Huang in Machine Vision and Applications (2024)

  6. No Access

    Article

    Diff-Font: Diffusion Model for Robust One-Shot Font Generation

    Font generation presents a significant challenge due to the intricate details needed, especially for languages with complex ideograms and numerous characters, such as Chinese and Korean. Although various few-s...

    Haibin He, Xinyuan Chen, Chaoyue Wang in International Journal of Computer Vision (2024)

  7. No Access

    Article

    A hardware-friendly logarithmic quantization method for CNNs and FPGA implementation

    Convolutional Neural Networks (CNNs) have been widely used in various fields due to their high accuracy and efficiency. The performance of CNNs is mainly affected by the computing capability, memory bandwidth,...

    Tao Jiang, Ligang Xing, **ming Yu, Junchao Qian in Journal of Real-Time Image Processing (2024)

  8. No Access

    Article

    Grounded Affordance from Exocentric View

    Affordance grounding aims to locate objects’ “action possibilities” regions, an essential step toward embodied intelligence. Due to the diversity of interactive affordance, i.e., the uniqueness of different indiv...

    Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao in International Journal of Computer Vision (2024)

  9. No Access

    Article

    Delving into Identify-Emphasize Paradigm for Combating Unknown Bias

    Dataset biases are notoriously detrimental to model robustness and generalization. The identify-emphasize paradigm appears to be effective in dealing with unknown biases. However, we discover that it is still ...

    Bowen Zhao, Chen Chen, Qian-Wei Wang, Anfeng He in International Journal of Computer Vision (2024)

  10. No Access

    Article

    Towards Defending Multiple \(\ell_p\)-Norm Bounded Adversarial Perturbations via Gated Batch Normalization

    There has been extensive evidence demonstrating that deep neural networks are vulnerable to adversarial examples, which motivates the development of defenses against adversarial attacks. Existing adversarial d...

    Aishan Liu, Shiyu Tang, Xinyun Chen, Lei Huang in International Journal of Computer Vision (2024)

  11. No Access

    Article

    EEA-Net: edge-enhanced assistance network for infrared small target detection

    With the development of deep learning, the performance of infrared small target detection (IRSTD) has been significantly improved. A precise shape of the target edge is crucial for segmenting small infrared ta...

    Chen Wang, Xiaopeng Hu, Xiang Gao, Haoyu Wei, Jiawei Tao in Machine Vision and Applications (2024)

  12. No Access

    Article

    GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions

    Image restoration in adverse weather conditions is a difficult task in computer vision. In this paper, we propose a novel transformer-based framework called GridFormer which serves as a backbone for image rest...

    Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo in International Journal of Computer Vision (2024)

  13. No Access

    Article

    Residual feature learning with hierarchical calibration for gaze estimation

    Gaze estimation aims to predict accurate gaze direction from natural eye images, which is an extremely challenging task due to both random variations in head pose and person-specific biases. Existing works often...

    Zhengdan Yin, Sanping Zhou, Le Wang, Tao Dai, Gang Hua in Machine Vision and Applications (2024)

  14. No Access

    Article

    Convex–Concave Tensor Robust Principal Component Analysis

    Tensor robust principal component analysis (TRPCA) aims at recovering the underlying low-rank clean tensor and residual sparse component from the observed tensor. The recovery quality heavily depends on the de...

    Youfa Liu, Bo Du, Yongyong Chen, Lefei Zhang in International Journal of Computer Vision (2024)

  15. No Access

    Article

    Diagram Perception Networks for Textbook Question Answering via Joint Optimization

    Textbook question answering requires a system to answer questions with or without diagrams accurately, given multimodal contexts that include rich paragraphs and diagrams. Existing methods usually utilize a pi...

    Jie Ma, Jun Liu, Qi Chai, Pinghui Wang in International Journal of Computer Vision (2024)

  16. No Access

    Article

    Robust Unpaired Image Dehazing via Density and Depth Decomposition

    To overcome the overfitting issue of dehazing models trained on synthetic hazy-clean image pairs, recent methods attempt to boost the generalization ability by training on unpaired data. However, most of exist...

    Yang Yang, Chaoyue Wang, Xiaojie Guo in International Journal of Computer Vision (2024)

  17. No Access

    Article

    VNAS: Variational Neural Architecture Search

    Differentiable neural architecture search delivers point estimation to the optimal architecture, which yields arbitrarily high confidence to the learned architecture. This approach thus suffers in calibration ...

    Benteng Ma, Jing Zhang, Yong Xia, Dacheng Tao in International Journal of Computer Vision (2024)

  18. No Access

    Article

    EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

    Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary algorithm (EA) and derives that both have consistent mathematical ...

    Jiangning Zhang, Xiangtai Li, Yabiao Wang in International Journal of Computer Vision (2024)

  19. No Access

    Article

    MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis

    Existing multimodal conditional image synthesis (MCIS) methods generate images conditioned on any combinations of various modalities that require all of them must be exactly conformed, hindering the synthesis ...

    Jianbin Zheng, Daqing Liu, Chaoyue Wang in International Journal of Computer Vision (2024)

  20. No Access

    Article

    MixStyle Neural Networks for Domain Generalization and Adaptation

    Neural networks do not generalize well to unseen data with domain shifts—a longstanding problem in machine learning and AI. To overcome the problem, we propose MixStyle, a simple plug-and-play, parameter-free ...

    Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang in International Journal of Computer Vision (2024)
