MAIDR Meets AI: Exploring Multimodal LLM-Based Data Visualization Interpretation by and with Blind and Low-Vision Users

JooYoung Seo, Sanchita S. Kamath, Aziz Zeidieh, Saairam Venkatesh, Sean McCurry · 2024 · ASSETS '24: Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility · doi:10.1145/3663548.3675660

Summary

This paper investigates how blind and low-vision (BLV) users work with multimodal large language models (LLMs) to interpret data visualizations, building on the authors' previously developed MAIDR (Multimodal Access and Interactive Data Representation) framework. MAIDR already offers several modalities for exploring a chart, including text descriptions, sonification (mapping data values to audio tones), and braille output. The new LLM-powered conversational interface, maidrAI, lets users ask natural-language questions about a chart and receive AI-generated answers. The research team, which includes BLV members, co-designed maidrAI to present multiple AI responses side by side for each query, so that users can cross-reference and critically evaluate LLM output rather than rely on a single answer.

Eight BLV participants were asked to interpret box plots, a particularly challenging visualization type because it encodes several statistical values (median, quartiles, whiskers, outliers) in a compact visual form. The study used a mixed-methods approach combining think-aloud protocols, interaction logs, and semi-structured interviews. Participants could switch freely between MAIDR's existing modalities (sonification, text descriptions, braille) and the new LLM chat interface, which let the researchers observe how AI-based interpretation complemented or competed with the other accessible representations. The study examined three dimensions: how participants personalized LLM interactions through prompt engineering, which types of visualization description they preferred, and how they verified the accuracy of LLM responses.
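To make the two technical ideas in this summary concrete, here is a minimal Python sketch of the values a box plot encodes and of a value-to-tone mapping of the kind sonification relies on. It is an illustration only, not MAIDR's implementation: the function names, the 1.5 × IQR outlier rule, the inclusive quartile method, and the 220–880 Hz pitch range are all assumptions chosen for the example.

```python
import statistics


def box_plot_summary(values):
    """Return the statistics a box plot encodes.

    Assumption: outliers follow Tukey's 1.5 * IQR rule and quartiles use
    the "inclusive" method; other tools may use different conventions.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers extend to the most extreme points that are not outliers.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "lower_whisker": min(inside),
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


def value_to_frequency(value, lo, hi, f_min=220.0, f_max=880.0):
    """Linearly map a data value in [lo, hi] to a pitch in hertz.

    Assumption: a linear mapping onto a hypothetical 220-880 Hz range.
    """
    if hi == lo:
        return (f_min + f_max) / 2
    return f_min + (value - lo) / (hi - lo) * (f_max - f_min)


summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
median_pitch = value_to_frequency(summary["median"], 1, 100)
```

In this sketch the value 100 falls outside the upper fence, so it is reported as an outlier and the upper whisker stops at 9. Higher data values map to higher pitches, which is one common sonification convention.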

Key findings

Participants showed diverse modality preferences: some relied heavily on sonification for initial data exploration and then turned to the LLM for specific questions, while others preferred to start with text descriptions or go directly to the LLM chat. The study identified a trust-but-verify dynamic. Participants valued the LLM's natural-language explanations of statistical concepts and data trends, but they were aware of potential hallucinations and developed verification strategies, including cross-referencing LLM responses against sonification patterns, comparing multiple AI responses for consistency, and asking the same question in different ways. Participants with stronger data literacy tended to ask more targeted analytical questions and were better at detecting LLM errors. Prompt engineering emerged as an important skill: participants iteratively refined their queries to get more useful responses, with strategies ranging from specifying the desired output format to providing context about their own data literacy level. The multiple-response design was valued for enabling comparison, though some participants found it cognitively demanding. Participants preferred descriptions that balanced statistical precision with plain-language interpretation, and many wanted the LLM to flag notable patterns or anomalies proactively rather than requiring them to ask about every aspect of the data.
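The consistency check that participants performed by hand, comparing several AI responses against each other and against another source, can be sketched in a few lines of Python. This is a hypothetical illustration, not part of maidrAI: it assumes each response has already been reduced to one claimed number (for example, the median), and the model names, the `check_claims` helper, and the 1% tolerance are invented for the example.

```python
import math


def check_claims(claims, reference, rel_tol=0.01):
    """Split responses by whether their claimed value matches a reference.

    claims: dict mapping a (hypothetical) model name to the number it reported.
    reference: the value obtained from a trusted source, such as the raw data.
    """
    agree, disagree = [], []
    for model, claimed in claims.items():
        if math.isclose(claimed, reference, rel_tol=rel_tol):
            agree.append(model)
        else:
            disagree.append(model)
    return {"agree": agree, "disagree": disagree, "consistent": not disagree}


# Hypothetical claimed medians from three responses to the same question.
result = check_claims(
    {"model_a": 5.5, "model_b": 5.5, "model_c": 7.0},
    reference=5.5,
)
```

Here `result["disagree"]` contains only `model_c`, the response a user would want to question further. A real check would also need to handle responses that give no number at all, which this sketch leaves out.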

Relevance

This research sits at the intersection of two accessibility concerns: access to data visualizations and responsible AI use by disabled people. MAIDR's multimodal approach (text, sonification, braille, and LLM chat) offers a model for how several accessible representations can complement one another, with each modality serving different cognitive needs. The finding that BLV users develop sophisticated verification strategies for LLM output challenges the assumption that disabled users are passive consumers of AI-generated content. For accessibility practitioners, the work shows that making data visualizations accessible takes more than alt text: it requires interactive, queryable representations that support the analytical depth sighted users get from visual inspection. The mixed-ability composition of the research team is itself noteworthy, since it puts the "nothing about us without us" principle into practice in AI accessibility research. The implications extend to any domain where LLMs act as accessibility mediators: trust calibration, multiple-response verification, and support for user agency in AI interactions are broadly applicable design principles.

Tags: blind and low vision · data visualization · large language models · generative AI · sonification · screen readers · multimodal interaction · prompt engineering · accessibility