Accessibility Evaluation based on Machine Learning Technique
Daisuke Sato, Hironobu Takagi, Chieko Asakawa · 2006 · Proceedings of the 8th International ACM SIGACCESS Conference on Computers and Accessibility (Assets '06) · doi:10.1145/1168987.1169041
Summary
This paper from IBM Tokyo Research Lab proposes using machine learning to evaluate the accessibility of presentation documents, a task for which traditional rule-based checking tools are insufficient. The authors argue that presentation documents are fundamentally different from HTML: they function as vector graphics composed of shapes, pictorial elements, text boxes, and embedded objects with properties like colour, font, style, and animation. Slide creators embed semantic information through visual arrangement, and that information cannot be captured by declarative rules. For example, there is no obvious rule to determine whether the z-order of objects represents the correct reading order, or whether related objects have been properly grouped. Informal presentation accessibility guidelines existed (covering criteria like setting reading order, creating accessible data tables, grouping related objects, and avoiding conveying information through styling alone), but no comprehensive evaluation tools did. The approach uses a support vector machine trained on visual features extracted from presentation slides, so that the model learns relationships between visible appearance characteristics and accessibility levels.
Key findings
The system extracts several categories of visual features for machine learning: positional relationships between objects (distances and angles based on z-ordering); counts and display areas of each object type (text boxes, pictorial objects with and without alt text, embedded objects, grouped objects, page titles, outline texts); total overlap area between objects (overlap indicates a relationship); total and average character counts (a low average suggests fragmented information); and the number of animations (which are difficult for visually impaired users to understand). The prototype was implemented for OpenDocument Format files and rates each slide on a five-level accessibility scale, displaying results in a list view alongside slide titles. Users can override the system's assessment, and their corrections feed back into the learning model, improving accuracy over time. Multiple users can share learning models for collective improvement. In early experiments with approximately 200 training slides, the system achieved 97% recall but only about 60% accuracy, which is not yet sufficient for effective evaluation. The authors attributed this to the small training set and believed accuracy would improve with more data and better feature selection.
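The feature categories listed above can be made concrete with a small sketch. The following Python is not the paper's implementation: the `SlideObject` fields, the feature names, and the choice to measure distance and angle between consecutive objects in z-order are all assumptions made for illustration, and objects are simplified to axis-aligned bounding boxes.

```python
import math
from dataclasses import dataclass


@dataclass
class SlideObject:
    """Hypothetical slide object; the fields are illustrative, not the paper's schema."""
    kind: str            # e.g. "text", "image", "embedded"
    x: float             # left edge of the bounding box
    y: float             # top edge of the bounding box
    width: float
    height: float
    text: str = ""
    has_alt: bool = False


def center(obj):
    """Center point of an object's bounding box."""
    return (obj.x + obj.width / 2, obj.y + obj.height / 2)


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned bounding boxes (0 if disjoint)."""
    w = min(a.x + a.width, b.x + b.width) - max(a.x, b.x)
    h = min(a.y + a.height, b.y + b.height) - max(a.y, b.y)
    return max(w, 0.0) * max(h, 0.0)


def extract_features(objects, animation_count=0):
    """Build a feature dict from objects given in z-order (assumed back to front)."""
    # Assumption: positional features are taken between consecutive objects in z-order.
    distances, angles = [], []
    for prev, cur in zip(objects, objects[1:]):
        (px, py), (cx, cy) = center(prev), center(cur)
        distances.append(math.hypot(cx - px, cy - py))
        angles.append(math.degrees(math.atan2(cy - py, cx - px)))

    # Total overlap summed over every unordered pair of objects.
    total_overlap = sum(
        overlap_area(a, b)
        for i, a in enumerate(objects)
        for b in objects[i + 1:]
    )

    texts = [o for o in objects if o.kind == "text"]
    images = [o for o in objects if o.kind == "image"]
    total_chars = sum(len(o.text) for o in texts)

    return {
        "z_distances": distances,
        "z_angles": angles,
        "text_count": len(texts),
        "text_area": sum(o.width * o.height for o in texts),
        "images_with_alt": sum(1 for o in images if o.has_alt),
        "images_without_alt": sum(1 for o in images if not o.has_alt),
        "total_overlap": total_overlap,
        "total_chars": total_chars,
        "avg_chars_per_text_box": total_chars / len(texts) if texts else 0.0,
        "animation_count": animation_count,
    }
```

A dict like this would still need to be flattened into a fixed-length numeric vector before being passed to a classifier such as a support vector machine; how the authors did that is not described in this summary.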
Relevance
This paper is an early and significant exploration of applying machine learning to accessibility evaluation, a concept that has become increasingly mainstream with the rise of AI-powered accessibility tools. The insight that semantic accessibility criteria (like correct reading order or meaningful object grouping) cannot be captured by declarative rules but can potentially be learned from visual patterns was ahead of its time. The user feedback loop, where corrections improve the model over time, anticipated modern active learning and human-in-the-loop AI approaches. While the 60% accuracy was modest, the 97% recall meant the system was good at flagging potential issues even if it generated false positives, which is a reasonable trade-off for an assistive evaluation tool. For current practitioners, this paper provides historical context for today's AI-based accessibility checkers and reinforces that presentation accessibility requires attention to visual semantics that go beyond what simple rule-based tools can evaluate. The work also connects to the same IBM team's DocExplorer research on making presentation documents accessible to screen readers.
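The trade-off between high recall and many false positives can be illustrated with a short worked example. The counts below are hypothetical and are not taken from the paper's experiment. The sketch also assumes a simplified two-class view (problem slide or not) rather than the five-level scale, and it uses precision as the measure that false positives pull down, which may not be how the reported 60% figure was computed.

```python
def recall(true_pos, false_neg):
    """Share of genuinely problematic slides that the checker flagged."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos, false_pos):
    """Share of flagged slides that were genuinely problematic."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, chosen only to illustrate the trade-off.
true_pos = 97    # problem slides correctly flagged
false_neg = 3    # problem slides the checker missed
false_pos = 65   # acceptable slides flagged by mistake

print(f"recall:    {recall(true_pos, false_neg):.2f}")
print(f"precision: {precision(true_pos, false_pos):.2f}")
```

Under these assumed counts a reviewer would see almost every real problem but would also dismiss roughly two of every five flags, which is tolerable when a human makes the final call.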
Tags: automated testing · machine learning · presentation accessibility · accessibility evaluation · support vector machine · visual features
Standards referenced: Section 508 · WCAG 1.0 · OASIS ODF