Application of Content Adaptation in Web Accessibility for the Blind

Pauli P. Y. Lai · 2011 · Proceedings of the International Cross-Disciplinary Conference on Web Accessibility (W4A) · doi:10.1145/1969289.1969298

Summary

This paper proposes transforming web pages into hierarchical, numbered menu structures — modeled on Interactive Voice Response Systems (IVRS) — so blind users can navigate content by pressing number keys rather than listening sequentially through entire pages. The author identifies the core problem: screen readers read pages top-to-bottom, forcing blind users to listen through potentially hundreds of lines before reaching content of interest. Even JAWS's heading and link navigation shortcuts still operate within a fundamentally sequential paradigm. The proposed solution applies a reverse engineering process to web pages: parsing the HTML to extract semantic elements (text, images, links, form fields) while discarding container elements (tables, divs) and layout elements (bold, font tags), then analyzing relationships between adjacent semantic elements based on visual properties (font family, size, weight, color, content length, hyperlink). Four relationship types are identified: parallel (visually similar elements), enriching summary (heading with elaborating detail), enriching group (group heading with related parallel elements), and descriptive (image with its descriptive text). Related elements are grouped into logical sections, organized into a "semantic DOM" tree, and each section receives a descriptive heading and a number.
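The relationship analysis described above can be illustrated with a minimal sketch. The summary does not give the paper's actual comparison rules, so everything here is an assumption made for illustration: the `Element` fields, the exact-match test for "visually similar", and the "larger or bolder" test for a heading are all hypothetical. Content length is used only in the heading check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    """A semantic element kept after containers and layout tags are discarded."""
    kind: str                  # "text", "image", "link", or "form"
    text: str = ""
    font_family: str = "serif"
    font_size: int = 12
    bold: bool = False
    color: str = "#000000"

    @property
    def style(self):
        # Visual signature compared between adjacent elements (assumed rule).
        return (self.font_family, self.font_size, self.bold,
                self.color, self.kind == "link")


def is_parallel(a: Element, b: Element) -> bool:
    """Assumption: 'visually similar' means an identical visual signature."""
    return "image" not in (a.kind, b.kind) and a.style == b.style


def more_prominent(a: Element, b: Element) -> bool:
    """Assumption: a heading is larger than, or bolder than, what follows."""
    return a.font_size > b.font_size or (a.bold and not b.bold)


def classify_pair(a: Element, b: Element) -> Optional[str]:
    """Label the relationship between two adjacent elements, if any."""
    if a.kind == "image" and b.kind == "text":
        return "descriptive"
    if is_parallel(a, b):
        return "parallel"
    if a.kind != "image" and more_prominent(a, b) and len(a.text) < len(b.text):
        return "enriching summary"
    return None


def classify_run(elements: List[Element]) -> Optional[str]:
    """A heading followed by two or more parallel elements forms a group."""
    if len(elements) >= 3:
        head, rest = elements[0], elements[1:]
        if (more_prominent(head, rest[0])
                and all(is_parallel(rest[0], r) for r in rest[1:])):
            return "enriching group"
    if len(elements) == 2:
        return classify_pair(elements[0], elements[1])
    return None
```

For instance, a bold 18-point headline followed by a longer 12-point paragraph would be labeled "enriching summary", while a bold label followed by several identically styled links would be labeled "enriching group". A real implementation would need tolerance thresholds rather than exact style equality.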

Key findings

The system generates multi-level numbered menus from web pages, demonstrated using Yahoo.com as an example. At each level, sections are presented as a numbered list (maximum 10 items per level, matching the keypad digits). Users hear the menu headings and press the corresponding number to drill into a section, descending through multiple abstraction levels until they reach the underlying content. Section headings are generated bottom-up (extracting headings from child content) and then numbered top-down. Headings are prefixed with functional indicators ("Link:", "Form:", "Description:", "Enriching:") so users know what type of content they are entering. Special keys provide navigation: "*" to go back, "#" to replay options. The approach is designed to work on any device with audio output and a number keypad, including basic mobile phones, enabling blind web browsing via the phone's existing keypad without requiring touchscreen gestures or complex keyboard commands. The paper acknowledges a tradeoff: too many levels create lengthy navigation, while too few produce long menu lists, so menu depth and breadth must be balanced.
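The keypad interaction can be sketched as a walk over a tree of sections. This is an illustrative sketch, not the paper's implementation: the `Section` class, the `navigate` function, and the choice to map the tenth item to the "0" key are assumptions, and the returned strings stand in for what a speech synthesizer would read aloud.

```python
from dataclasses import dataclass, field
from typing import List

MAX_ITEMS = 10  # one menu item per keypad digit


@dataclass
class Section:
    heading: str
    children: List["Section"] = field(default_factory=list)
    content: str = ""  # spoken when the section is a leaf


def key_for(index: int) -> str:
    """Assumption: items 1-9 use keys 1-9 and the tenth item uses key 0."""
    return str((index + 1) % 10)


def announce(section: Section) -> str:
    """Text to be spoken for a section: its menu, or its content if a leaf."""
    if not section.children:
        return section.content
    visible = section.children[:MAX_ITEMS]
    return "; ".join(f"{key_for(i)}: {child.heading}"
                     for i, child in enumerate(visible))


def navigate(root: Section, keys: List[str]) -> List[str]:
    """Replay a sequence of key presses and return everything spoken."""
    path = [root]
    spoken = [announce(root)]
    for key in keys:
        current = path[-1]
        if key == "*":
            # Go back one level; at the root there is nowhere to go.
            if len(path) > 1:
                path.pop()
        elif key.isdigit():
            index = (int(key) - 1) % 10
            if index < len(current.children[:MAX_ITEMS]):
                path.append(current.children[index])
        # "#" and unrecognized keys fall through and replay the current menu.
        spoken.append(announce(path[-1]))
    return spoken
```

Pressing "1" twice on a two-level tree would descend to the first leaf and read its content, "*" would return to the parent menu, and "#" would repeat that menu. Balancing depth against list length, as the paper discusses, would happen when the tree is built rather than during navigation.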

Relevance

This paper tackles a real and persistent problem, the inefficiency of sequential screen reader navigation, with an unconventional approach borrowed from telephony. The IVRS metaphor is clever because it leverages an interaction pattern that many blind users (and all phone users) are already familiar with, and it works on basic mobile phones without touchscreens or specialized software. The semantic DOM concept of analyzing visual styling relationships to discover logical page sections anticipated later work on web page segmentation for accessibility. However, the approach has significant limitations: automatically generating meaningful headings for arbitrary content groupings is an unsolved problem, the multi-level menu navigation can itself become tedious, and the 10-item-per-level constraint may not accommodate the complexity of modern web pages. The system was not evaluated with blind users, so its practical usability is unknown. Modern screen readers have improved navigation considerably through landmark and region navigation and features such as the VoiceOver rotor, though the core sequential navigation problem the paper identifies remains relevant for complex pages.

Tags: blind and low vision · content adaptation · screen readers · mobile accessibility · web page segmentation · navigation · DOM manipulation