In-Page Navigation Aids for Screen-Reader Users with Automatic Topicalisation and Labelling

Jorge Sassaki Resende Silva, Paula Christina Figueira Cardoso, Raphael Winckler De Bettio, Daniela Cardoso Tavares, Carlos Alberto Silva, Willian Massami Watanabe, Andre Pimenta Freire · 2024 · ACM Transactions on Accessible Computing · doi:10.1145/3649223

Summary

This paper addresses a fundamental challenge for screen reader users: navigating lengthy web documents that lack proper heading structure. When web pages include neither semantic headings nor internal navigation links, users must read content linearly, which increases cognitive load and makes specific information hard to locate. The research presents a tool that automatically generates navigation aids by analyzing text content and inserting headers with descriptive labels.

The project was conducted in two development cycles. In the first cycle, the authors developed algorithms using natural language processing techniques to segment text into topics and generate labels. The topic segmentation algorithm uses SBERT (Sentence-BERT) to create an embedding for each sentence, calculates the similarity between sentences using cosine distance, and places a topic boundary wherever similarity drops significantly. The labeling algorithm then extracts keywords from each segment, ranks them by frequency, groups synonyms, and selects the most representative terms as labels.

The first cycle included a user study with eight blind and partially-sighted screen reader users, who completed tasks with and without automatically generated headers. Participants answered questions about four different texts while researchers measured cognitive load on a 10-point scale and tracked task completion time. Results showed preliminary indicators of reduced cognitive load for texts with automatically generated headers, though participants criticized the quality of some labels.

The second cycle involved co-designing an improved browser extension with two blind experts in web accessibility. Based on feedback from the first cycle, the team replaced the keyword-based labeling with OpenAI's ChatGPT API to generate more meaningful headers. The resulting Chrome extension activates via a keyboard shortcut, scans the page text, segments it using the BERT-based algorithm, sends each segment to ChatGPT for header generation, and injects h2 headings, with internal navigation links to them at the beginning of the page.
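The first-cycle pipeline described in this summary (a topic boundary wherever the similarity between neighbouring sentences drops, then frequency-ranked keywords as labels) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the fixed threshold of 0.5, the tiny stopword list, and the two-dimensional toy vectors are all assumptions made for illustration; in the paper the vectors would be SBERT sentence embeddings, and the synonym-grouping step is omitted here.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption; a real list would be language-specific).
STOPWORDS = {"and", "the", "a", "of", "to", "in", "for"}


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def segment_by_similarity(embeddings, threshold=0.5):
    """Group sentence indices into segments.

    A new segment starts whenever the similarity between a sentence and the
    one before it falls below `threshold` (a hypothetical cut-off standing in
    for "drops significantly").
    """
    if not embeddings:
        return []
    segments = [[0]]
    for i in range(1, len(embeddings)):
        if cosine_similarity(embeddings[i - 1], embeddings[i]) < threshold:
            segments.append([i])      # similarity dropped: open a new topic
        else:
            segments[-1].append(i)    # same topic: extend the current segment
    return segments


def label_segment(sentences, max_terms=2):
    """Label a segment with its most frequent non-stopword terms."""
    words = re.findall(r"[^\W\d_]+", " ".join(sentences).lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    # Rank by descending frequency; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return " ".join(word for word, _ in ranked[:max_terms])


sentences = [
    "Screen readers announce headings.",
    "Headings help screen readers navigate.",
    "Bread needs flour and yeast.",
    "Knead the bread and rest the bread dough.",
]
# Toy 2-D vectors standing in for SBERT sentence embeddings (assumption).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]

segments = segment_by_similarity(embeddings)
labels = [label_segment([sentences[i] for i in seg]) for seg in segments]
```

With these toy inputs the two accessibility sentences form one segment and the two baking sentences another. The labels are bare keyword strings, which hints at why participants in the first cycle found some of the frequency-based labels unsatisfying.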

Key findings

The first user study revealed that text segmentation itself was more valuable to participants than the specific header labels. Even when participants did not use the headers directly during tasks, they reported that having the text divided into sections helped them navigate, locate themselves within the document, and recall information during a second read. Participant P2-C1 described the experience as feeling like "running my eyes through the text," recreating a skimming experience typically unavailable to screen reader users.

Brazilian screen reader users demonstrated different navigation behaviors from users in North America and Europe. While international surveys show heading navigation as the dominant strategy, Brazilian users more commonly read page content linearly using arrow keys, a pattern attributed to the history of DosVox, an early Brazilian screen reader that lacked heading navigation features.

The second evaluation cycle, with seven blind participants using the ChatGPT-enhanced browser extension, revealed both improvements and persistent challenges. Participants appreciated being able to use familiar screen reader shortcuts (such as H and Shift+H for heading navigation) after the extension had processed a page. However, ChatGPT generated different headers on each activation, so users could not develop reliable mental models of page structure. Some participants found the AI-generated labels too long, too complex in vocabulary, or disconnected from the actual topic content.

Processing speed emerged as a significant usability barrier. The time required to segment text and generate labels through the ChatGPT API discouraged some participants from using the tool, particularly for shorter documents where manual reading might be faster.

Relevance

This research demonstrates both the potential and limitations of using AI and NLP techniques to retrofit accessibility into poorly structured web content. For organizations considering similar approaches, the findings suggest that structural improvements (text segmentation) may deliver more consistent value than AI-generated labels, which remain unpredictable.

The co-design process with blind experts yielded practical insights applicable to any assistive technology development. Experts recommended providing multiple navigation strategies beyond headings alone, adding internal links at page beginnings similar to skip links, ensuring keyboard shortcuts are easy to reach, and providing clear feedback when processing begins and ends.

The research highlights that automatic tools cannot fully replace proper accessible authoring. Participants with experience in digital content design were skeptical about substituting AI-generated content for human-created accessible markup. The quality of ChatGPT outputs varied significantly, and some participants emphasized that automated tools should be viewed as supplements to, not replacements for, accessible development practices.

For practitioners working with multilingual content, the study offers lessons about adapting NLP tools across languages. The authors found that algorithms trained on English corpora required significant adaptation for Portuguese, and resources for languages other than English remain limited. Future implementations should consider language-specific requirements from the outset.

Tags: screen readers · navigation · natural language processing · topic segmentation · large language models · assistive technology · headings · BERT · ChatGPT

Standards referenced: WCAG · ARIA