Improving the Accessibility of Aurally Rendered HTML Tables

Robert Filepp, James Challenger, Daniela Rosu · 2002 · Proceedings of the Fifth International ACM Conference on Assistive Technologies (Assets 02) · doi:10.1145/638249.638254

Summary

This paper from IBM's T.J. Watson Research Center proposes TTPML (Table To Prose Markup Language), an XML-compliant markup language designed to transform HTML tables into intelligible prose descriptions for blind and visually impaired users. The authors identify a fundamental problem: when screen readers linearize tables by reading cells left-to-right, top-to-bottom, the output is often incomprehensible because the relationships between column headers and data values are lost. For example, an Olympic cycling results table is rendered as a stream of disconnected labels and numbers: "Cycling Track Country Gold Silver Bronze Total FRA 4 2 0 6 GER 2 2 2 6..." TTPML addresses this by allowing content providers to create reusable templates that specify how tables should be converted to prose. A TTPML model defines phrases (regions of cells with traversal rules), ornaments (contextual text inserted around cell values), and transformation rules for cell content. The model is hierarchical: phrases can contain embedded sub-phrases, so complex table structures can be described at multiple levels of detail. For the cycling example, TTPML produces: "Cycling Track. France has 4 Gold medals, 2 Silver Medals, 0 Bronze medals, and 6 Total medals. Germany has 2 Gold medals..." TTPML also supports JSML (Java Speech Markup Language) directives for controlling how the prose is spoken, including voice gender, pauses, and changes in rate and volume. Templates can be applied at origin servers, proxy servers, or browsers, and can be reused across tables with similar structures.
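The contrast between naive linearization and template-driven prose can be sketched in a few lines of Python. This is not TTPML syntax, which the summary does not reproduce; the names `linearize`, `to_prose`, and `COUNTRY_NAMES` are hypothetical, the fixed "has" and "medals" strings stand in for ornaments, and the country-code expansion stands in for a cell transformation rule.

```python
# Illustrative only: plain Python, not TTPML syntax.
CAPTION = "Cycling Track"
HEADERS = ["Country", "Gold", "Silver", "Bronze", "Total"]
ROWS = [["FRA", 4, 2, 0, 6], ["GER", 2, 2, 2, 6]]

# Hypothetical cell transformation rule: expand country codes to names.
COUNTRY_NAMES = {"FRA": "France", "GER": "Germany"}


def linearize(caption, headers, rows):
    """Read cells left to right, top to bottom, with no added context."""
    cells = [caption, *headers]
    for row in rows:
        cells.extend(str(cell) for cell in row)
    return " ".join(cells)


def to_prose(caption, headers, rows):
    """Repeat each column header beside its value and add connecting words."""
    sentences = [f"{caption}."]
    for row in rows:
        name = COUNTRY_NAMES.get(row[0], row[0])
        # Ornament: append the word "medals" after each header/value pair.
        parts = [
            f"{value} {header} medals"
            for header, value in zip(headers[1:], row[1:])
        ]
        sentences.append(
            f"{name} has " + ", ".join(parts[:-1]) + f", and {parts[-1]}."
        )
    return " ".join(sentences)


print(linearize(CAPTION, HEADERS, ROWS))
print(to_prose(CAPTION, HEADERS, ROWS))
```

The first call reproduces the flat stream of labels and numbers quoted above; the second pairs every value with its column header, which is the contextual reinforcement the authors argue for.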

Key findings

A prototype implementation demonstrated the feasibility of TTPML-based table-to-prose transformation. The system was implemented as a SAX parser built with Apache Jakarta tools, using IBM's ViaVoice for text-to-speech conversion. Performance measurements on a Pentium III processor showed that, for a cycling results table, HTML parsing took 180 ms, prose and sound execution took 212 ms, and creating the sound file took 1280 ms (under 2 seconds in total). The resulting prose sound objects could be stored as compressed audio files ("watermarked" onto the HTML) for subsequent delivery, or streamed in real time. For the same table, the raw HTML was 688 bytes, the TTPML template added 805 bytes, and the resulting .wav file was 630 KB, though MP3 compression or streaming could significantly reduce this. The system handled both simple tables (cycling results with a single header row) and complex tables with embedded sub-tables (tennis results with match descriptions, set scores, and player names). The tennis example demonstrated conditional rendering (skipping empty cells) and the ability to limit detail through phrase spanning, allowing users to hear match summaries without excessive detail. The approach preserved copyright by keeping transformation rules separate from content: TTPML templates could be distributed independently without reproducing the original table content.
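As a quick check on the reported figures, the arithmetic can be worked through in Python. The numbers are those given above; the variable names are illustrative, and reading "630KB" as 630 × 1000 bytes is an assumption, since the summary does not say which convention applies.

```python
# Figures as reported above for the cycling results table.
timings_ms = {
    "html_parsing": 180,
    "prose_and_sound_execution": 212,
    "sound_file_creation": 1280,
}
total_ms = sum(timings_ms.values())
sound_share = timings_ms["sound_file_creation"] / total_ms

html_bytes = 688
template_bytes = 805
markup_bytes = html_bytes + template_bytes

# Assumption: "630KB" is read as 630 * 1000 bytes; the source does not say.
wav_bytes = 630 * 1000
size_ratio = wav_bytes / markup_bytes

print(f"total processing time: {total_ms} ms")
print(f"share spent creating the sound file: {sound_share:.0%}")
print(f"HTML plus template: {markup_bytes} bytes")
print(f"uncompressed audio is about {size_ratio:.0f} times larger")
```

The total comes to 1672 ms, consistent with the "under 2 seconds" figure, and creating the sound file accounts for roughly three quarters of it. The uncompressed audio is also several hundred times larger than the markup, which is why compression or streaming matters for delivery.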

Relevance

This paper addresses a problem that remains significant today: HTML tables are still challenging for screen reader users, despite advances in assistive technology. While modern screen readers have improved table navigation (allowing users to move by row/column and hear header associations), the fundamental issue of tables being a visual format that loses meaning when linearized persists, particularly for complex or nested tables. The TTPML approach of using author-specified templates to generate prose descriptions anticipates modern accessibility techniques like ARIA table semantics and the ongoing discussion about whether tables should include prose summaries. For practitioners, the key insight is that automated table linearization alone is insufficient — meaningful aural presentation requires contextual reinforcement (repeating column headers with values) and ornamentation (adding connecting words like "has" and "medals"). The template-based approach, where one TTPML model can describe multiple tables with similar structure, offers a practical balance between fully automated (often poor quality) and fully manual (unsustainable) approaches to table accessibility. The research is limited by the lack of user testing with blind participants, which the authors acknowledge as future work.

Tags: web accessibility · tables · aural rendering · screen reader · XML · markup language · blindness · table linearization

Standards referenced: WCAG 1.0 · Section 508 · HTML 4.01