-
Chapter and Conference Paper
PANDA: Prompt-Based Context- and Indoor-Aware Pretraining for Vision and Language Navigation
Pretrained visual-language models have extensive world knowledge and are widely used in visual and language navigation (VLN). However, they are not sensitive to indoor scenarios for VLN tasks. Another challe...
-
Chapter and Conference Paper
ACT: Action-assoCiated and Target-Related Representations for Object Navigation
Object navigation tasks require an agent to find a target in an unknown environment based on its observations. Researchers employ various techniques, such as extracting high-level semantic information and buil...