Towards Making Mathematics a First Class Citizen in General Screen Readers
Volker Sorge, Charles Chen, T. V. Raman, David Tseng · 2014 · Proceedings of the 11th Web for All Conference (W4A) · doi:10.1145/2596695.2596700
Summary
This paper presents a comprehensive approach to integrating mathematical speech translation into ChromeVox, Google's open-source screen reader for the Chrome browser and Chrome OS. The authors address the fundamental challenge that mathematical notation on the web exists in three distinct formats — pure MathML markup, MathJax-rendered content, and pre-rendered images with hidden LaTeX or AsciiMath markup — and a general screen reader must handle all of them uniformly. The solution builds on ChromeVox's four-axis architecture: granularities (segmenting the DOM into navigable units), walkers (interactive content navigation), speech output (customizable text-to-speech generation), and alternative representations (swapping DOM elements for more accessible versions). For math-as-images, the system detects hidden markup in alt attributes or class names (e.g., Wikipedia's LaTeX-in-images, MathWorld's AsciiMath), sends it to MathJax running as a web service for translation to MathML, and stores the result for the speech rule engine. This means over 50,000 Wikipedia pages and 10,000 MathWorld pages with image-based math become accessible without any changes to those sites.
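The detection step for math-as-images can be sketched roughly as follows. This is a minimal illustration, not ChromeVox's implementation: the class names in `MATH_IMAGE_CLASSES` and the helper `find_hidden_math` are assumptions made for the example, and the later conversion of the extracted markup to MathML through MathJax is not shown.

```python
from html.parser import HTMLParser

# Assumed mapping from image class names to the hidden markup format.
# These names are illustrative; real sites may use other conventions.
MATH_IMAGE_CLASSES = {"tex": "latex", "asciimath-img": "asciimath"}


class HiddenMathFinder(HTMLParser):
    """Collects (format, markup) pairs from <img> tags that carry math markup."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        alt = (attr.get("alt") or "").strip()
        classes = (attr.get("class") or "").split()
        for cls in classes:
            fmt = MATH_IMAGE_CLASSES.get(cls)
            if fmt and alt:
                # The alt text is treated as the hidden source markup.
                self.found.append((fmt, alt))
                break


def find_hidden_math(html):
    """Return every hidden math expression found in the given HTML string."""
    finder = HiddenMathFinder()
    finder.feed(html)
    finder.close()
    return finder.found


page = (
    '<p>Area: <img class="tex" alt="\\pi r^2" src="a.png"> '
    'and a logo <img alt="logo" src="logo.png"></p>'
)
print(find_hidden_math(page))
```

Images without a recognised class, such as the logo above, are skipped, so ordinary alt text is never mistaken for mathematics.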
Key findings
The paper introduces a flexible speech rule engine that translates MathML into natural-sounding utterances using condition/action pairs based on XPath expressions. Rules can be customized along multiple dimensions: mathematical domain (algebra, geometry, calculus, logic), reading style (verbose vs. brief), and prosody (pitch, rate, volume, pauses, earcons). Spatial layout is conveyed through pitch changes — raising pitch when moving up in a fraction and lowering it when moving down. Two novel interactive exploration methods are presented: tree exploration, which follows the MathML tree structure allowing linear traversal of sub-expressions, and level-based exploration, which defines nested granularities allowing users to move between sub-expressions at one level and dive deeper with arrow keys. The level-based approach fits naturally into ChromeVox's existing content navigation paradigm. A key innovation is the semantic representation that rewrites MathML into a more human-oriented tree structure, recognizing patterns like function applications, operator precedence, and matrix structures to produce more natural speech output — for example, reading "4ac" as "4 times a times c" rather than pronouncing each symbol individually. The system was evaluated internally by blind and visually impaired colleagues at Google.
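The condition/action idea behind the speech rule engine can be illustrated with a small sketch. It assumes namespace-free MathML and uses the limited XPath subset of Python's `xml.etree.ElementTree`; the `Rule` type, the rule set, and the spoken phrases are all hypothetical and are not the rules shipped with ChromeVox.

```python
import xml.etree.ElementTree as ET
from typing import Callable, NamedTuple, Optional

Speak = Callable[[ET.Element], str]

# Spoken forms for a few operators (illustrative only).
OPERATORS = {"+": "plus", "-": "minus", "=": "equals", "\u2062": "times"}


class Rule(NamedTuple):
    tag: str                   # element name the rule applies to
    condition: Optional[str]   # XPath that must match below the node, or None
    action: Callable[[ET.Element, Speak], str]


# Rules are tried in order, so more specific conditions come first.
RULES = [
    # A fraction containing a compound part gets explicit delimiters.
    Rule("mfrac", "./mrow",
         lambda n, s: f"start fraction {s(n[0])} over {s(n[1])} end fraction"),
    # Any other fraction is read briefly.
    Rule("mfrac", None, lambda n, s: f"{s(n[0])} over {s(n[1])}"),
    Rule("msup", None, lambda n, s: f"{s(n[0])} to the power of {s(n[1])}"),
    Rule("mo", None,
         lambda n, s: OPERATORS.get((n.text or "").strip(), (n.text or "").strip())),
]


def speak(node: ET.Element) -> str:
    """Translate a MathML node into text by applying the first matching rule."""
    for rule in RULES:
        if node.tag == rule.tag and (
            rule.condition is None or node.find(rule.condition) is not None
        ):
            return rule.action(node, speak)
    # Fallback: read the children in order, or the text of a leaf node.
    if len(node):
        return " ".join(speak(child) for child in node)
    return (node.text or "").strip()


simple = ET.fromstring("<math><mfrac><mn>1</mn><mn>2</mn></mfrac></math>")
nested = ET.fromstring(
    "<math><mfrac><mi>a</mi>"
    "<mrow><mi>b</mi><mo>+</mo><mn>1</mn></mrow></mfrac></math>"
)
print(speak(simple))  # 1 over 2
print(speak(nested))  # start fraction a over b plus 1 end fraction
```

Keeping the rules in a plain ordered list is one simple way to swap in a different rule set for another domain or reading style; prosody changes such as pitch or pauses would need richer output than plain strings.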
Relevance
This paper is a significant contribution to STEM accessibility: it demonstrates that mathematical content can be handled by general-purpose screen readers rather than requiring expensive specialist software. The work addresses the real-world diversity of how math appears on the web, covering not just ideal MathML but also the images with hidden markup that dominate sites like Wikipedia. The speech rule engine developed here later became the standalone open-source Speech Rule Engine (SRE) project, which is used by MathJax and by several screen readers. The interactive exploration techniques, which let users navigate complex formulas at different levels of detail rather than hearing them as one long utterance, anticipated how later assistive technology presents math. For STEM educators and web developers, the paper makes the case that math accessibility is achievable with existing web standards and does not require fundamental changes to how content is authored.
Tags: screen readers · mathematics accessibility · MathML · ChromeVox · text-to-speech · STEM accessibility · semantic enrichment · interactive exploration · speech rules
Standards referenced: MathML · HTML5 · EPUB 3