Efficient and Effective Information Finding on Small Screen Devices

Pauli P. Y. Lai · 2013 · Proceedings of the 10th International Cross-Disciplinary Conference on Web Accessibility (W4A) · doi:10.1145/2461121.2461127

Summary

This paper proposes a reverse-engineering approach for automatically adapting desktop-oriented webpages for efficient information seeking on small-screen mobile devices. The core contribution is a "semantic-DOM tree" model that analyzes the relationships between semantic elements on a webpage, stripping away presentational and container HTML elements to retain only content-bearing nodes. The system identifies three types of relationships between elements: parallel (visually similar elements such as lists of news headlines), enriching (a title with associated detail content), and descriptive (images with their associated text or alt attributes). Using these relationships, the system groups related elements into logical sections and generates multiple adapted views. Relationship detection combines machine learning, specifically anomaly detection with multivariate Gaussian models, for parallel relationships with DOM structure rules for enriching and descriptive relationships. The system was evaluated on over 80 webpages, achieving F-measures of 0.82 for parallel relationships, 0.77 for enriching relationships, and 0.86 for descriptive relationships.
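To make the anomaly-detection idea concrete, the following is a minimal sketch, not the paper's implementation. It fits a Gaussian to feature vectors taken from element pairs known to be parallel, then treats a new pair as parallel only if its density is above a threshold. The two features (hypothetical normalized differences in font size and width), the diagonal-covariance simplification, and the threshold value are all assumptions made for illustration; this summary does not state which features or covariance form the paper uses.

```python
import math

def fit_gaussian(samples):
    """Estimate per-feature mean and variance from known-parallel feature vectors."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [
        # Floor the variance so a constant feature does not cause division by zero.
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-9)
        for d in range(dims)
    ]
    return means, variances

def log_density(x, means, variances):
    """Log density of a Gaussian with diagonal covariance (an assumed simplification)."""
    total = 0.0
    for xi, mu, var in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
    return total

def is_parallel(x, means, variances, log_epsilon):
    """A pair is flagged as an anomaly (not parallel) when its log density is below the threshold."""
    return log_density(x, means, variances) >= log_epsilon

# Hypothetical training data: [font-size difference, width difference] for parallel pairs.
training = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [0.0, 0.0], [0.1, 0.1]]
means, variances = fit_gaussian(training)
LOG_EPSILON = -5.0  # assumed threshold; in practice it would be tuned on labeled data

similar_pair = [0.05, 0.05]    # e.g. two headlines in the same list
dissimilar_pair = [0.9, 0.8]   # e.g. a headline next to a page footer
print(is_parallel(similar_pair, means, variances, LOG_EPSILON))
print(is_parallel(dissimilar_pair, means, variances, LOG_EPSILON))
```

Working in log space avoids numerical underflow when a pair lies far from the training data. A full-covariance model would additionally capture correlations between features, at the cost of a matrix inversion.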

Key findings

The paper presents five adaptation strategies generated from the semantic-DOM tree model: depth-first (sequential content presentation segmented into sub-pages), breadth-first (hierarchical menu navigation by section), hybrid (combining the two to reduce navigation depth), main-content (filtering out navigation and peripheral content using a LinkRatio heuristic), and page-map (providing a topic overview with jump links). A task-based evaluation with over 200 university students using iPhones, Android devices, and the Opera Mini Simulator tested information-finding tasks on the Wikipedia main page. On the Android and Opera browsers, the adapted versions (particularly the main-content adaptation) outperformed the original desktop page and Google Mobile Proxy's thumbnail approach in effectiveness (correct answers) and efficiency (completion time). The main-content adaptation performed best overall because removing peripheral content made it easier to locate answers. However, iPhone users were sometimes faster with the original page, likely because they were accustomed to the pinch-to-zoom interaction pattern. The Google Mobile Proxy thumbnail approach was consistently the worst performer across all browsers.
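The main-content strategy can be illustrated with a small sketch. This summary does not give the paper's exact definition of LinkRatio, so the version below assumes one plausible reading: the share of a section's visible text characters that sit inside anchor elements. Sections whose ratio exceeds a threshold are treated as navigation or peripheral content and dropped. The class name, function names, and the 0.5 threshold are hypothetical.

```python
from html.parser import HTMLParser

class LinkTextCounter(HTMLParser):
    """Counts visible text characters, and how many of them fall inside <a> elements."""

    def __init__(self):
        super().__init__()
        self.link_depth = 0
        self.link_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        n = len(data.strip())
        self.total_chars += n
        if self.link_depth > 0:
            self.link_chars += n

def link_ratio(section_html):
    """Assumed definition: anchor-text characters divided by all text characters."""
    counter = LinkTextCounter()
    counter.feed(section_html)
    counter.close()
    if counter.total_chars == 0:
        return 0.0
    return counter.link_chars / counter.total_chars

def filter_main_content(sections, threshold=0.5):
    """Keep sections that are mostly prose; the threshold is an assumed value."""
    return [s for s in sections if link_ratio(s) <= threshold]

nav = '<ul><li><a href="/a">Home</a></li><li><a href="/b">About</a></li></ul>'
article = '<p>Long body text here with one <a href="/x">link</a>.</p>'
main_sections = filter_main_content([nav, article])
```

Here the navigation list consists entirely of link text and is filtered out, while the article paragraph, with only a small share of link text, is kept.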

Relevance

While this paper predates the widespread adoption of responsive web design, its underlying principles remain relevant for accessibility. The semantic-DOM tree concept, which analyzes content relationships rather than relying on specific HTML markup, addresses a real problem: many webpages are poorly structured, and automated tools need to work with imperfect markup. The content adaptation strategies (especially main-content filtering and page-map overviews) parallel techniques used by screen readers and reading modes today. For accessibility practitioners, the finding that removing peripheral content significantly improves task performance reinforces the importance of clear content hierarchy and the value of features like reader modes. A limitation of the paper is that it focuses on information-seeking efficiency for general users rather than specifically addressing disability-related access barriers, though the techniques have clear applications for users who navigate content non-visually or with limited motor control on small devices.

Tags: mobile accessibility · content adaptation · responsive design · web accessibility · information architecture · semantic HTML · machine learning