Planning, Reasoning, and Agents for Non-Visual Navigation of Tables and Frames

Enrico Pontelli, Tran Cao Son · 2002 · Proceedings of the Fifth International ACM Conference on Assistive Technologies (Assets 02) · doi:10.1145/638249.638264

Summary

This paper from New Mexico State University extends previous work on a Domain Specific Language (DSL) for non-visual table navigation by reinterpreting it within the framework of formal action theory. The core problem addressed is that HTML tables and frames are inherently multi-dimensional visual structures that are extremely difficult for screen reader users to navigate, since screen readers linearise content into a one-dimensional stream. The authors had previously developed a DSL that allowed semantic descriptions of table navigation strategies to be attached to tables, enabling more meaningful traversal than simple cell-by-cell movement. This paper generalises that approach by modelling table navigation as a planning problem: the table's structure is represented as a conceptual graph capturing its navigational semantics, navigation actions are formalised with preconditions and effects, and an AI planner can automatically generate navigation sequences to reach user-specified goals. For example, a user browsing a travel information table could specify a goal such as finding a hotel that costs less than a given amount, and the system would automatically navigate to the relevant cell rather than requiring manual traversal. The system uses logic programming (specifically the DLV system and the GOLOG language) to compute navigation plans, and it supports interactive execution in which users can make choices during navigation.
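
The planning formulation summarised above can be illustrated with a small sketch. It is not the authors' implementation: the paper relies on DLV and GOLOG, whereas this example substitutes a plain breadth-first search in Python. The table contents, the action names, the helper functions, and the price limit of 100 are all hypothetical and chosen only for illustration. Each action has a precondition (the target cell must exist) and an effect (the cursor moves), and the goal is stated declaratively as a predicate on a cell.

```python
from collections import deque

# Hypothetical travel table; all values are invented for illustration.
TABLE = [
    ["City", "Hotel", "Price"],
    ["Rome", "Aurora", 140],
    ["Rome", "Bella", 90],
    ["Oslo", "Fjord", 120],
]

# Navigation actions as (row delta, column delta).
ACTIONS = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def applicable(state, action, table):
    """Precondition: the cell the action would move to must exist."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    return 0 <= row + d_row < len(table) and 0 <= col + d_col < len(table[0])


def apply_action(state, action):
    """Effect: the cursor moves to the neighbouring cell."""
    d_row, d_col = ACTIONS[action]
    return (state[0] + d_row, state[1] + d_col)


def plan(table, start, goal):
    """Breadth-first search for a shortest action sequence reaching a goal cell."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal(table, state):
            return steps, state
        for action in ACTIONS:
            if applicable(state, action, table):
                nxt = apply_action(state, action)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [action]))
    return None  # no cell satisfies the goal


def cheap_price(table, state, limit=100):
    """Declarative goal: a data cell in the Price column below `limit`.

    The limit of 100 is an assumption made for this sketch, not a figure
    taken from the paper.
    """
    row, col = state
    return row > 0 and table[0][col] == "Price" and table[row][col] < limit


steps, cell = plan(TABLE, (0, 0), cheap_price)
print(steps, cell)
```

With the invented data above, the search ends on the Price cell of the "Bella" row, the only one under the assumed limit. The user states what to find, and the sequence of cursor movements is computed rather than issued by hand.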

Key findings

The paper demonstrates that framing table navigation as a planning problem enables three significant capabilities beyond traditional screen reader interaction. First, users can express navigation goals declaratively rather than issuing step-by-step movement commands — the system automatically computes navigation paths to satisfy those goals. Second, table authors or domain specialists can define partial navigation skeletons that combine predefined strategies with runtime flexibility, allowing navigation to adapt based on actual table content and user decisions. Third, the system supports temporal and procedural constraints that restrict or guide the planner — for instance, ensuring that a toxicity cell is always visited immediately after accessing a related chemical compound cell. The approach also introduces an agent-based execution model with on-line planning and monitoring, so the agent can detect when actions fail and backtrack to previous states. The authors argue this represents a shift from passive screen reader navigation to intelligent, goal-directed document exploration that leverages the semantic structure underlying visual table layouts.
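
Two of the ideas above, the "immediately after" constraint and the monitored execution, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the cell labels, and the failing `open_frame` action are hypothetical, and the paper expresses such constraints within its logic programming framework rather than as Python checks. The first function tests whether a visiting order respects a constraint of the form "every trigger cell is followed directly by a required cell". The second runs a plan step by step, detects a failed action, and falls back to the last state that was reached successfully.

```python
def satisfies_immediately_after(visits, trigger, required):
    """True if every `trigger` visit is directly followed by a `required` visit."""
    for i, kind in enumerate(visits):
        if kind == trigger and (i + 1 >= len(visits) or visits[i + 1] != required):
            return False
    return True


def execute_with_monitoring(start, steps, perform):
    """Execute steps in order, monitoring each action's outcome.

    `perform(state, step)` is a caller-supplied (hypothetical) function that
    returns the new state, or None when the action fails. On failure the
    executor returns the last good state and the index of the failed step,
    so that a planner could replan from there.
    """
    history = [start]
    for index, step in enumerate(steps):
        outcome = perform(history[-1], step)
        if outcome is None:
            return history[-1], index  # fall back to the previous state
        history.append(outcome)
    return history[-1], None  # every step succeeded


def demo_perform(state, step):
    """Toy environment: the invented action 'open_frame' always fails."""
    if step == "open_frame":
        return None
    return state + (step,)


good_route = ["compound", "toxicity", "name", "compound", "toxicity"]
bad_route = ["compound", "name", "toxicity"]
print(satisfies_immediately_after(good_route, "compound", "toxicity"))
print(satisfies_immediately_after(bad_route, "compound", "toxicity"))

final_state, failed_at = execute_with_monitoring(
    (), ["down", "open_frame", "right"], demo_perform
)
print(final_state, failed_at)
```

In a fuller system the constraint check would prune candidate plans during search rather than validate them afterwards, and the failed step index would be handed back to the planner.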

Relevance

While this paper is highly theoretical and the proposed system was never widely deployed, it raises important questions about the gap between how sighted users perceive table information at a glance and the laborious cell-by-cell experience screen reader users face. The idea that users should be able to express what they want to find rather than how to navigate to it remains a compelling vision for accessible data exploration. Modern developments in AI and large language models have made aspects of this vision more feasible than when the paper was written. For practitioners, the paper underscores that table accessibility goes far beyond adding header markup — truly equivalent access requires understanding the semantic relationships within tabular data and providing navigation mechanisms that reflect those relationships. The work also highlights that complex data tables may need dedicated navigation strategies rather than generic screen reader commands.

Tags: table accessibility · screen readers · non-visual navigation · semantic navigation · artificial intelligence · domain specific language · HTML tables

Standards referenced: ADA · Section 508 · WCAG 1.0