"That's in the eye of the beholder": Layers of Interpretation in Image Descriptions for Fictional Representations of People with Disabilities

Emory James Edwards, Kyle Lewis Polster, Isabel Tuason, Emily Blank, Michael Gilbert, Stacy Branham · 2021 · Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '21) · doi:10.1145/3441852.3471222

Summary

This paper investigates how to write accurate and sensitive image descriptions for fictional representations of people with disabilities, a challenge that arises because there is no real subject to consult about preferred identity terminology. The study emerged from Google's "Avatar Project," which commissioned an artist to create 30 diverse illustrations of fictional users for designers and developers to use in the company's internal design system. The researchers worked with 25 participants with disabilities across nine focus groups and nineteen follow-up interviews, totaling nearly 28 hours of dialogue. Participants iteratively assessed and co-designed image descriptions for nine of these illustrations. Six depicted visible disabilities or assistive devices (a white cane user, a person with vitiligo, a skateboarder with a prosthetic hand, a person with a service dog, a person wearing a hearing aid and signing, and an electric wheelchair user); the other three, which showed no visible disability, were included to explore how invisible disability is represented. Participants were intentionally recruited to include people with multiple marginalized identities, spanning varied racial, gender, and disability identities, to surface how intersecting identities complicate description writing. The study shows that image descriptions involve multiple "layers of interpretation": the artist interprets an identity, the describer interprets the image, and the reader interprets the description, and bias can enter at each stage.

Key findings

Five key themes emerged from the analysis. First, disability should be described in context: foregrounded only when the image's purpose and audience call for it, not included automatically. Second, participants had deeply personal and sometimes conflicting preferences for how disability should be described. Only three of the sixteen participants who provided self-descriptions mentioned disability, and others often omitted it because of the associated stigma. Many preferred describing assistive devices to applying disability identity labels, because devices are more concrete and describing them is less presumptuous. Third, there was significant tension around level of detail: concise descriptions are more efficient for screen reader users, but more detail can educate non-disabled designers about disability. Fourth, participants identified multiple sites where bias enters descriptions. The writer's subjective perspective, the reader's prior experiences, and the artist's original intent all shape the final interpretation. On race specifically, participants largely preferred describing skin tone to assuming a racial identity. Fifth, participants identified ocularcentrism, the privileging of sighted perspectives, as pervasive even within image accessibility work itself, noting that image descriptions are part of a fundamentally vision-centric digital environment. Participants argued that non-visual information should also have a place in descriptions, and that image descriptions can be an art form rather than mere visual translation.

Relevance

This research offers guidance for anyone writing image descriptions in professional contexts such as design systems, marketing materials, educational content, and web development. The concept of "layers of interpretation" offers a practical framework for understanding how bias enters descriptions at every stage of the content pipeline, from artist to describer to reader. The finding that participants preferred describing assistive devices to applying disability labels has direct implications for alt text guidelines and automated image description systems. The study makes a compelling case that accessibility cannot be retrofitted through post-hoc description alone; people with disabilities and other marginalized identities need to be involved throughout the content creation pipeline, from image conception to final description. For practitioners, the considerations summarized in Table 3 of the paper provide actionable guidelines organized around four questions: who is describing, where the image is situated, how to describe visual elements, and how to avoid misrepresentations. The tension between conciseness and educational detail is particularly relevant for organizations building inclusive design systems.

Tags: image accessibility · alt text · image descriptions · disability representation · design systems · ableism · blind and low vision · co-design · identity

Standards referenced: WCAG 2.1