Visual Dialogue
Also known as: Visual Dialog, VisDial
Visual dialogue is an AI task that involves holding a multi-turn natural language conversation about visual content such as an image or video frame. Unlike single-turn visual question answering (VQA), visual dialogue systems maintain context across multiple exchanges, using dialogue history to provide coherent and consistent responses. This capability is important for accessibility because it allows blind and low vision users to iteratively explore visual content, asking follow-up questions to build a progressively detailed understanding of what is shown, rather than relying on a single static description.
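The difference from single-turn VQA can be sketched as a small session object that carries the dialogue history into every new question. This is only an illustrative sketch: `VisualDialogueSession`, `AnswerFn`, `toy_answer_fn`, and the `photo_001` identifier are hypothetical names invented here, and the toy function stands in for a real vision-language model rather than reproducing any particular system.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical signature (an assumption, not a real library API):
# (image_id, earlier (question, answer) pairs, new question) -> answer
AnswerFn = Callable[[str, List[Tuple[str, str]], str], str]


@dataclass
class VisualDialogueSession:
    """Keeps one image reference and the running dialogue history together."""
    image_id: str
    answer_fn: AnswerFn
    history: List[Tuple[str, str]] = field(default_factory=list)

    def ask(self, question: str) -> str:
        # Pass a copy of the history so the model sees every earlier turn.
        answer = self.answer_fn(self.image_id, list(self.history), question)
        # Record the new turn so later questions can build on it.
        self.history.append((question, answer))
        return answer


def toy_answer_fn(image_id: str, history: List[Tuple[str, str]], question: str) -> str:
    """Stand-in for a real model: reports how many earlier turns it received."""
    return f"(answer to {question!r} using {len(history)} earlier turn(s))"


session = VisualDialogueSession(image_id="photo_001", answer_fn=toy_answer_fn)
first = session.ask("What is in the picture?")
second = session.ask("What color is it?")  # "it" only makes sense given the first turn
```

A single-turn VQA system would correspond to calling `answer_fn` with an empty history every time, so a follow-up such as "What color is it?" would have nothing to refer back to.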
Category: Artificial Intelligence · Computer Vision · Natural Language Processing · Visual Accessibility
Related: Visual Question Answering · Image Captioning · Audio Description