Access to Interpretation: How Formal Cues Ground Interpretive Alt Text for Paintings

Vera L. Zhong, Lucy Jiang, Kathryn E. Ringland · 2026 · Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems (CHI EA ’26) · doi:10.1145/3772363.3798632

Summary

This CHI 2026 Extended Abstract examines a gap between mainstream alt text conventions and the interpretive work that paintings are designed to evoke. The authors argue that dominant alt text guidance foregrounds brevity, objectivity, and functional equivalence. That approach serves many everyday images well but underserves artworks, where meaning emerges through formal attributes such as composition, color, light, gesture, and style. Rather than treating interpretation as a departure from objectivity, they propose treating it as a structured, accountable translation from observable cues.

The study draws on the Art Institute of Chicago's Open Access Image dataset. The authors assembled the first 91 works that were labeled as paintings, were open access, and had institution-provided alt text. Two researchers independently conducted close readings of each painting and its alt text, documenting observable formal elements (composition, color palette, lighting, spatial arrangement, gesture, materiality) alongside analytic notes on what the alt text prioritized or omitted. They then met to compare notes and synthesize recurring patterns.

From this corpus, the authors contribute a provisional taxonomy that maps objective formal cues to plausible interpretive responses (affect, symbolism, narrative), using explicit hedging language ("may suggest", "may evoke", "typically associated with") to distinguish observation from interpretation. They illustrate the taxonomy with Monet's Stack of Wheat, showing how compositional isolation and a cool palette can scaffold inferences about solitude or bleakness without overclaiming.
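The cue-to-interpretation mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `CueMapping` class, its field names, and the pairing of each cue with a specific reading (isolation with solitude, cool palette with bleakness) are assumptions made for the example. Only the hedging phrases and the Monet cues come from the summary.

```python
from dataclasses import dataclass

# Hedging phrases quoted in the summary.
HEDGES = ("may suggest", "may evoke", "typically associated with")


@dataclass(frozen=True)
class CueMapping:
    """One hypothetical taxonomy row: an observable cue and a reading it may support."""
    formal_cue: str        # objective and observable, e.g. composition or palette
    interpretation: str    # affect, symbolism, or narrative
    hedge: str = "may suggest"

    def __post_init__(self):
        # Reject unhedged links so observation and interpretation stay distinct.
        if self.hedge not in HEDGES:
            raise ValueError(f"unsupported hedge: {self.hedge!r}")

    def sentence(self) -> str:
        """Render the row as one hedged sentence."""
        cue = self.formal_cue
        return f"{cue[:1].upper()}{cue[1:]} {self.hedge} {self.interpretation}."


# Illustrative rows loosely based on the Stack of Wheat example (wording assumed).
rows = [
    CueMapping("the compositional isolation of the stack", "solitude"),
    CueMapping("the cool palette", "bleakness", hedge="may evoke"),
]
description = " ".join(row.sentence() for row in rows)
print(description)
```

Keeping the hedge as a required, validated field means an interpretive claim cannot be rendered without its inferential step being visible in the output.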

Key findings

Alt text across the 91-painting corpus was standardized in form but inconsistent in function. The majority of entries (65 of 91) used a material-only template of the form "A work made of [material]," and a further 16 used a material-plus-format variant. The phrase "A work made of oil on canvas" appeared 28 times across the corpus. Alt text was generally very brief: mean 10.2 words, median 8, maximum 54, standard deviation 6.79. Only 10 of 91 paintings had depictive alt text mentioning subjects, scenes, or other visual content. The authors observed a hierarchical pattern in which alt text privileged information already available in metadata (medium, format) while leaving unsupported the formal cues that actually carry meaning, such as symmetry, contrast, composition, and lighting.

Their resulting taxonomy organizes interpretive authoring around three components: (1) objective formal cues grounded in formal analysis, (2) interpretive attributes plausibly supported by those cues (affect, symbolism, narrative/relation), and (3) hedging language that makes the inferential step transparent. This structure reframes subjectivity as an accountable translation rather than a departure from objectivity, and can scaffold both human authors and automated systems.
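The template tallies and length statistics reported in these findings are the kind of figures a short script can produce. The sketch below is illustrative only: the regular expression is an assumed approximation of the material-only template, the two-way split into "material-only" and "other" is coarser than the authors' coding (the summary does not give the exact form of the material-plus-format variant), and the sample strings other than "A work made of oil on canvas" are hypothetical. The summary also does not say whether the reported standard deviation is the sample or population form; `pstdev` is used here as an assumption.

```python
import re
from collections import Counter
from statistics import mean, median, pstdev

# Assumed pattern for the material-only template "A work made of [material]".
MATERIAL_ONLY = re.compile(r"^A work made of [^,;.]+\.?$", re.IGNORECASE)


def classify(alt_text: str) -> str:
    """Label an alt text 'material-only' or 'other' (a deliberately coarse split)."""
    return "material-only" if MATERIAL_ONLY.match(alt_text.strip()) else "other"


def length_stats(alt_texts):
    """Word-count mean, median, maximum, and population standard deviation."""
    counts = [len(text.split()) for text in alt_texts]
    return {
        "mean": mean(counts),
        "median": median(counts),
        "max": max(counts),
        "stdev": pstdev(counts),
    }


# Hypothetical sample; only the first string is quoted in the summary.
sample = [
    "A work made of oil on canvas",
    "A work made of tempera on panel",
    "A haystack in a snowy field under a pale sky",
]
print(Counter(classify(text) for text in sample))
print(length_stats(sample))
```

Counting words by whitespace splitting is the simplest possible choice; a different tokenization would shift the numbers slightly.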

Relevance

For accessibility practitioners working with museums, galleries, digital collections, and cultural heritage platforms, this paper challenges the reflex to treat "objective, concise" alt text as universally sufficient. When applied to paintings, that reflex produces redundant restatements of metadata that deny blind and low vision (BLV) readers the formal scaffolding that sighted viewers receive without effort. The taxonomy is immediately useful as an authoring checklist and as a specification for alt-text authoring tools or automated description systems: distinguish metadata from depiction, prompt for specific formal cues, and require hedged language for interpretive layers. The authors clearly acknowledge the limitations: a single institution, 91 works, and no evaluation with BLV readers yet. The planned follow-up with BLV participants, curators, and critical discourse analysis will determine whether the taxonomy's inferences hold up in practice.
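As one way to picture the checklist in an authoring tool, the sketch below validates a structured draft against the three requirements named above. Everything here is hypothetical: the `AltTextDraft` fields, the `review` function, and the problem messages are invented for illustration, and the substring check for hedges is a crude stand-in for whatever a real tool would use.

```python
from dataclasses import dataclass, field

# Hedging phrases quoted in the summary.
HEDGES = ("may suggest", "may evoke", "typically associated with")


@dataclass
class AltTextDraft:
    """Hypothetical structured draft that keeps the layers separate."""
    metadata: str                                     # medium, format (already in the record)
    depiction: str                                    # what the painting shows
    formal_cues: list = field(default_factory=list)   # composition, color, light, ...
    interpretation: str = ""                          # optional interpretive layer


def review(draft: AltTextDraft) -> list:
    """Return checklist problems; an empty list means the draft passes."""
    problems = []
    depiction = draft.depiction.strip().lower()
    if not depiction:
        problems.append("depiction is missing")
    elif depiction == draft.metadata.strip().lower():
        problems.append("depiction only restates metadata")
    if not draft.formal_cues:
        problems.append("no formal cues recorded")
    text = draft.interpretation.lower()
    if text and not any(hedge in text for hedge in HEDGES):
        problems.append("interpretation is not hedged")
    return problems


bad = AltTextDraft(
    metadata="A work made of oil on canvas",
    depiction="A work made of oil on canvas",
)
good = AltTextDraft(
    metadata="Oil on canvas",
    depiction="A single stack of wheat in a snowy field",
    formal_cues=["isolated central form", "cool palette"],
    interpretation="The cool palette may evoke bleakness.",
)
print(review(bad))
print(review(good))
```

Returning a list of problems rather than a pass/fail flag lets a tool prompt the author for exactly the missing layer.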

Tags: alt text · image description · blind and low vision · museum accessibility · cultural heritage · interpretive access · art analysis · screen readers

Standards referenced: WCAG 2.0 · WCAG 2.1