  1. No Access

    Article

    Sequential image encoding for vision-to-language problems

    The combination of visual recognition and language understanding aims to build a commonly shared space between heterogeneous vision and text data, as in the tasks of image captioning and visual question...

    Jicheng Wang, Yuanen Zhou, Zhenzhen Hu, Xu Zhang in Multimedia Tools and Applications (2021)

  2. No Access

    Chapter and Conference Paper

    Revisiting Knowledge Distillation for Image Captioning

    Knowledge Distillation (KD) [6], an effective technique for model compression and for improving a model’s performance, has been widely studied and adopted. However, most previous research focuses on image classifi...

    **g**g Dong, Zhenzhen Hu, Yuanen Zhou in Artificial Intelligence (2021)

  3. No Access

    Chapter and Conference Paper

    Video Captioning Based on the Spatial-Temporal Saliency Tracing

    Video captioning is a crucial task for video understanding and has attracted much attention recently. The Regions-of-Interest (ROIs) of a video always contain the most interesting information for the audience. Diffe...

    Yuanen Zhou, Zhenzhen Hu, Xueliang Liu in Advances in Multimedia Information Process… (2018)