-
Article
Sequential image encoding for vision-to-language problems
The combination of visual recognition and language understanding aims to build a commonly shared space between heterogeneous vision and text data, as in the tasks of image captioning and visual question...
-
Chapter and Conference Paper
Revisiting Knowledge Distillation for Image Captioning
Knowledge Distillation (KD) [6], an effective technique for compressing models and improving their performance, has been widely studied and adopted. However, most previous research focuses on image classifi...
-
Chapter and Conference Paper
Video Captioning Based on the Spatial-Temporal Saliency Tracing
Video captioning is a crucial task for video understanding and has attracted much attention recently. The regions of interest (ROIs) of a video always contain the most interesting information for the audience. Diffe...