Diffscriber: Describing Visual Design Changes to Support Mixed-Ability Collaborative Presentation Authoring
Yi-Hao Peng, Jason Wu, Jeffrey Bigham, Amy Pavel · 2022 · Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology (UIST) · doi:10.1145/3526113.3545637
Summary
This paper presents Diffscriber, a system that identifies and describes visual design changes made to slide presentations, enabling blind and visually impaired (BVI) presenters to participate meaningfully in collaborative slide authoring with sighted collaborators. The research begins with a formative study of nine BVI presenters (ages 21 to 58) who regularly give presentations. The study revealed a common workflow: BVI presenters author text content (typically in a Word document or simple slide template), then hand it to a sighted collaborator who turns it into a visually designed presentation. The critical gap is that BVI presenters cannot independently review or give feedback on the visual changes their collaborators make: changes to content (added images, revised text), style (fonts, colors, sizes), and layout (spatial arrangement of elements). Diffscriber addresses this gap with a Google Slides extension that compares original and revised slide versions. The system uses CLIP, a multimodal machine learning model, to establish correspondences between elements across slide versions, then applies rule-based classifiers to detect content additions, replacements, removals, style changes, and layout changes. It generates hierarchically organized natural language descriptions that BVI authors can navigate at multiple levels of detail, from high-level summaries down to individual element properties.
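The two-stage pipeline described above (match elements across versions, then classify what changed) can be illustrated with a minimal sketch. This is not the authors' implementation: the Element fields, the greedy matching strategy, the 0.8 similarity threshold, and the function names are all assumptions made for illustration, and the short embedding lists stand in for the CLIP embeddings a real system would compute.

```python
import math
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical slide element; field names are illustrative only."""
    id: str
    content: str
    style: dict       # e.g. {"font": "Arial", "size": 18}
    box: tuple        # (x, y, width, height)
    embedding: list   # stand-in for a CLIP embedding vector


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_elements(original, revised, threshold=0.8):
    """Greedy one-to-one matching, most similar pairs first (assumed strategy)."""
    scored = sorted(
        ((cosine(o.embedding, r.embedding), i, j)
         for i, o in enumerate(original)
         for j, r in enumerate(revised)),
        reverse=True,
    )
    used_original, used_revised, matches = set(), set(), []
    for similarity, i, j in scored:
        if similarity < threshold:
            break  # remaining pairs are too dissimilar to correspond
        if i in used_original or j in used_revised:
            continue
        used_original.add(i)
        used_revised.add(j)
        matches.append((original[i], revised[j]))
    return matches


def describe_changes(original, revised, threshold=0.8):
    """Rule-based classification into removed/added/content/style/layout."""
    matches = match_elements(original, revised, threshold)
    matched_original = {o.id for o, _ in matches}
    matched_revised = {r.id for _, r in matches}
    changes = [("removed", o.id) for o in original if o.id not in matched_original]
    changes += [("added", r.id) for r in revised if r.id not in matched_revised]
    for o, r in matches:
        if o.content != r.content:
            changes.append(("content", r.id))
        if o.style != r.style:
            changes.append(("style", r.id))
        if o.box != r.box:
            changes.append(("layout", r.id))
    return changes


# Toy slides: the title is restyled and moved, the body is dropped, an image is added.
original = [
    Element("title", "Results", {"size": 18}, (0, 0, 100, 20), [1.0, 0.0]),
    Element("body", "Details", {"size": 12}, (0, 30, 100, 60), [0.0, 1.0]),
]
revised = [
    Element("title", "Results", {"size": 32}, (10, 5, 100, 20), [1.0, 0.1]),
    Element("image", "chart.png", {}, (0, 30, 100, 60), [-1.0, -1.0]),
]
changes = describe_changes(original, revised)
```

In this sketch the matching stage decides which elements count as "the same" before any rule fires, so a correspondence error propagates into the change list: a missed match surfaces as a spurious removal plus a spurious addition rather than as a style or layout change.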
Key findings
In a user study with six BVI presenters reviewing professionally redesigned slides, participants identified nearly three times as many changes with Diffscriber (mean 28.83) as with accessible slides alone (mean 10.33), a statistically significant difference (p < 0.00001). They also provided significantly more feedback comments (mean 2.83 vs. 0.83, p < 0.01). All participants preferred Diffscriber over accessible slides alone and over their prior experience with PowerPoint. The system achieved 77% accuracy in establishing correspondences between original and revised slide elements, with recall generally higher than precision, meaning that BVI users were likely to learn about the majority of edits, at the cost of some reported changes being incorrect. Participants rated content changes as most useful for authoring, followed by style and layout changes. Notably, participants expressed enthusiasm about learning design patterns from the change descriptions, saying the tool could help them develop their own visual design sensibilities over time.
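To make the precision and recall distinction concrete, here is a small sketch of how the two measures are computed over sets of changes. The change labels and counts are hypothetical and chosen only to show a case where recall exceeds precision; they are not data from the paper.

```python
def precision_recall(detected, actual):
    """Return (precision, recall) for detected changes against the true changes."""
    detected, actual = set(detected), set(actual)
    true_positives = len(detected & actual)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: every real edit is reported, plus two false alarms.
actual = {"title:style", "body:content", "image:added", "logo:layout"}
detected = actual | {"footer:style", "body:layout"}
precision, recall = precision_recall(detected, actual)
```

Here every real edit appears in the detected set, so recall is at its maximum, while the two extra reports lower precision. A recall-favoring system of this kind rarely hides an edit from the reviewer but may occasionally describe a change that did not happen.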
Relevance
This work addresses a significant and underexplored accessibility barrier: the exclusion of blind people from visual design collaboration. While much accessibility research focuses on making content consumable, Diffscriber tackles the harder problem of making the creative authoring process accessible. The findings have implications beyond presentations: the same principles of detecting and describing visual changes could apply to collaborative design in documents, websites, marketing materials, and other visually rich media. For accessibility practitioners, the paper highlights that making slides screen-reader accessible (adding alt text, correct reading order) is necessary but insufficient; BVI authors also need to understand and have agency over visual design decisions. The formative study's findings about collaborative workflows are useful for organizations seeking to support mixed-ability teams in content creation roles.
Tags: blind and low vision · mixed-ability collaboration · presentation accessibility · authoring tools · screen readers · image descriptions · change detection
Standards referenced: WCAG 2.0