Surfacing Variations to Calibrate Perceived Reliability of MLLM-generated Image Descriptions
Meng Chen, Akhil Iyer, Amy Pavel · 2025 · ASSETS 2025: 27th International ACM SIGACCESS Conference on Computers and Accessibility
This paper addresses a critical safety problem in AI-powered visual access technology: multimodal large language models (MLLMs) like GPT-4o, Gemini, and Claude produce fluent, confident image descriptions that can contain fabricated content, misinterpretations, and omissions…
blindness · low vision · image descriptions · multimodal AI · large language models