Chapter and Conference Paper
Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer
State-of-the-art transformer-based video instance segmentation (VIS) approaches typically utilize either single-scale spatio-temporal features or per-frame multi-scale features during the attention computation...