Vision-and-Language Navigation
Also known as: VLN
Vision-and-language navigation is a task in which an agent follows natural-language instructions to move through a visual environment, grounding phrases such as 'turn left at the blue sofa' in what it sees in real time. VLN research has moved from small indoor simulators to real-world deployments powered by vision-language models. In accessibility, VLN pipelines underpin assistive navigation for blind travellers (e.g., SeeWay, WanderGuide, VLM-Drone), turning free-form spoken queries into step-by-step wayfinding guidance that accounts for visible landmarks and obstacles.
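The observe, ground, act loop described above can be illustrated with a toy sketch. It assumes the instruction has already been parsed into (action, landmark) steps and that each observation is simply a set of visible landmark names; real VLN systems instead ground language in images, typically with a vision-language model. The names `Step` and `follow` are hypothetical and do not come from any of the systems mentioned.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One parsed sub-instruction (hypothetical representation)."""
    action: str    # e.g. "turn_left"
    landmark: str  # e.g. "blue sofa"


def follow(steps, observations):
    """Walk through observations, acting on each step once its landmark is seen.

    Assumption: an observation is a set of landmark names already detected
    in the current view. Until the next landmark appears, the agent moves
    forward; after the last step is done, it stops.
    """
    actions = []
    i = 0  # index of the sub-instruction currently being grounded
    for visible in observations:
        if i == len(steps):
            actions.append("stop")
            break
        if steps[i].landmark in visible:
            # Landmark grounded in the current view: execute and advance.
            actions.append(steps[i].action)
            i += 1
        else:
            actions.append("forward")
    return actions


steps = [Step("turn_left", "blue sofa"), Step("turn_right", "door")]
observations = [{"lamp"}, {"blue sofa", "lamp"}, {"hallway"}, {"door"}, set()]
print(follow(steps, observations))
```

In this sketch, grounding is reduced to a set-membership test; the point is only to show how instruction steps and observations interleave over time.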
Category: Artificial Intelligence · Navigation and Wayfinding · AI and accessibility · Robotics
Related: Vision-Language Model · Grounding · Navigation