Modelling web navigation with the user in mind

Ruslan Fayzrakhmanov, Max Göbel, Wolfgang Holzinger, Bernhard Krüpl, Andreas Mager, Robert Baumgartner · 2010 · Proceedings of the 2010 International Cross Disciplinary Conference on Web Accessibility (W4A) · doi:10.1145/1805986.1806006

Summary

This paper introduces the ABBA (Advanced Barrier-free Browser Accessibility) framework, a novel approach to screen reader design that replaces traditional sequential, DOM-based navigation with a multi-axial navigation model. The core insight is that current screen readers force blind users through a single linear reading order derived from the DOM tree, which discards the layout information, spatial relationships, and multiple navigation strategies that sighted users intuitively exploit. ABBA instead creates multiple "navigation axes": different linear serializations of page elements, each representing a distinct navigation strategy (e.g., jumping between articles, following menu items, traversing sidebar content, or moving between columns). These axes are superimposed into a navigation grid whose intersection points let users switch strategies dynamically, analogous to changing trains at stations in a transit system. The framework uses "enrichers" based on Web Information Extraction (WIE) that automatically analyze the visual rendering of a page, not just its HTML source, to extract four types of document semantics: layout semantics (geometric relations), content semantics (textual information), interaction semantics (all page interactions), and site semantics (site-wide meta information). These are unified into an RDF-based ontological model that flattens the hierarchical DOM tree into a rich, queryable triple store.
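The navigation grid described above can be illustrated with a minimal sketch. This is not the ABBA implementation: the class names (`NavigationGrid`, `Cursor`), the axis names, and the element ids are all hypothetical, and the sketch assumes an axis is simply an ordered list of element ids and that switching is allowed only where two axes share an element (a "station").

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class NavigationGrid:
    """Named axes, each an ordered list of page-element ids (hypothetical model)."""
    axes: dict[str, list[str]] = field(default_factory=dict)

    def axes_through(self, element: str) -> list[str]:
        """Return the axes containing `element`; more than one means an intersection."""
        return [name for name, stops in self.axes.items() if element in stops]


@dataclass
class Cursor:
    grid: NavigationGrid
    axis: str
    index: int = 0

    @property
    def element(self) -> str:
        return self.grid.axes[self.axis][self.index]

    def step(self, delta: int = 1) -> str:
        """Move along the current axis, clamping at both ends."""
        last = len(self.grid.axes[self.axis]) - 1
        self.index = max(0, min(last, self.index + delta))
        return self.element

    def switch(self, new_axis: str) -> str:
        """Change axis, allowed only if the new axis passes through the current element."""
        if new_axis not in self.grid.axes_through(self.element):
            raise ValueError(f"axis {new_axis!r} does not pass through {self.element!r}")
        self.index = self.grid.axes[new_axis].index(self.element)
        self.axis = new_axis
        return self.element


# Illustrative page: a coarse "articles" axis and a finer axis inside article 2.
grid = NavigationGrid({
    "articles": ["article-1", "article-2", "article-3"],
    "article-2-parts": ["article-2", "a2-para-1", "a2-para-2"],
    "menu": ["home", "news", "sport"],
})
cursor = Cursor(grid, "articles")
cursor.step()                     # move to article-2 on the coarse axis
cursor.switch("article-2-parts")  # change "trains" at the shared element
cursor.step()                     # continue on the fine-grained axis
```

In this sketch the cursor stays on the same element when the axis changes, which mirrors the transit analogy: the user changes lines at a station without changing position.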

Key findings

The paper identifies four categories of document semantics that are lost when web pages are serialized for screen readers: layout, content, interaction, and site semantics. The ABBA architecture addresses this through a pipeline in which GeoDump (a WebKit-based component) extracts the visual rendering of each page element, including its content, bounding box, and rendered styles, into RDF triples. Enrichers then layer additional facts onto this model, such as topological neighborhood relations (northOf, westOf), alignment patterns, and inferred headings (found by clustering font size and style when HTML heading tags are missing). The navigation component automatically identifies axes from this enriched model. The train map analogy describes the intended user experience: expert users take "fast trains" (high-level axes with few stops) to get an overview, then switch to "slower trains" (fine-grained axes) when close to their target. The set of user interactions is small: read the current text, change axis, list the available axes, get orientation (breadcrumb-style feedback showing the path from the document root to the cursor), and backtrack. The system also includes a graphical annotation tool for manually specifying axes where automatic detection is insufficient.
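To make the enricher idea concrete, here is a rough sketch of two enrichers that derive triples from rendered bounding boxes. It makes several assumptions that go beyond the summary above: the `Box` fields, the geometric definitions of `northOf` and `westOf` (strictly above or to the left, with overlap on the other dimension, and not limited to immediate neighbors), the `InferredHeading` label, and the "larger than the most common font size" rule, which is a much simpler stand-in for the clustering the paper describes. Triples are plain Python tuples rather than real RDF.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Rendered geometry of one page element (hypothetical fields, in pixels)."""
    id: str
    x: float          # left edge
    y: float          # top edge; y grows downward, as on screen
    w: float
    h: float
    font_size: float


def _overlap(a_start: float, a_len: float, b_start: float, b_len: float) -> bool:
    """True if two 1-D intervals share any interior point."""
    return a_start < b_start + b_len and b_start < a_start + a_len


def topology_enricher(boxes: list[Box]) -> list[tuple[str, str, str]]:
    """Emit (subject, predicate, object) triples for northOf / westOf (assumed definitions)."""
    triples = []
    for a in boxes:
        for b in boxes:
            if a is b:
                continue
            if a.y + a.h <= b.y and _overlap(a.x, a.w, b.x, b.w):
                triples.append((a.id, "northOf", b.id))
            if a.x + a.w <= b.x and _overlap(a.y, a.h, b.y, b.h):
                triples.append((a.id, "westOf", b.id))
    return triples


def heading_enricher(boxes: list[Box]) -> list[tuple[str, str, str]]:
    """Flag elements whose font is larger than the page's most common font size."""
    body_size = Counter(b.font_size for b in boxes).most_common(1)[0][0]
    return [(b.id, "type", "InferredHeading") for b in boxes if b.font_size > body_size]


# Illustrative layout: a title above two paragraphs, with a sidebar to the right.
boxes = [
    Box("title", 0, 0, 600, 40, 24),
    Box("para1", 0, 50, 600, 100, 12),
    Box("para2", 0, 160, 600, 100, 12),
    Box("sidebar", 620, 50, 200, 210, 12),
]
triples = topology_enricher(boxes) + heading_enricher(boxes)
```

Because each enricher only appends facts, further enrichers (for alignment patterns, for example) could be added without changing the existing ones, which is the layering the pipeline description suggests.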

Relevance

This research addresses a fundamental limitation of screen reader navigation that persists today: the forced linearization of inherently spatial, multi-dimensional web content. While modern screen readers have improved with ARIA landmarks and heading navigation, they still fundamentally offer a single reading order augmented by element-type jumping, which is far from the multi-strategy navigation ABBA envisions. The concept of navigation axes as patterns that transfer across unrelated websites is particularly powerful: a user who learns that news sites have article axes, menu axes, and sidebar axes can apply that knowledge to unfamiliar sites. The enricher concept, which infers document structure automatically from the visual rendering rather than relying on correct HTML markup, is especially relevant given that many websites still have imperfect semantic markup. The ontological approach of representing page structure as flat RDF triples rather than a DOM tree offers a more flexible foundation for accessibility analysis. For screen reader developers and researchers, this paper provides a compelling theoretical framework for moving beyond sequential navigation.

Tags: screen readers · web navigation · visual impairment · semantic web · ontology · web information extraction · document structure

Standards referenced: WCAG · WAI-ARIA