Enabling Access to Geo-referenced Information: Atlas.txt
Kavita E. Thomas, Livia Sumegi, Leo Ferres, Somayajulu Sripada · 2008 · Proceedings of the 2008 International Cross-Disciplinary Conference on Web Accessibility (W4A) · doi:10.1145/1368044.1368066
Summary
This paper from the University of Aberdeen and Carleton University presents Atlas.txt, a data-to-text natural language generation (NLG) system designed to make geo-referenced data, such as census maps and thematic choropleth maps, accessible to visually impaired users. The core problem is that governments publish census and demographic data as color-shaded maps where geographic patterns are immediately apparent to sighted users but completely inaccessible to blind users. While the underlying data tables can be read by screen readers, hearing a list of 31 values (e.g., for Scotland's council areas) does not convey the spatial patterns and trends that a sighted person grasps in under a minute from a map.

The authors first conducted a usability study with 5 blind and 5 sighted participants from Ottawa, using census data from Statistics Canada and Scottish Census Results Online. The study had three parts: comparing blind participants' comprehension of tables versus texts, evaluating both groups' corrections of expert-written texts, and comparing task performance on texts versus maps/tables for familiar and unfamiliar geographic regions.

Atlas.txt follows a pipeline architecture: raw geo-referenced data is analyzed using clustering algorithms and an electronic gazetteer to identify geographic patterns (maximal and minimal clusters, increasing/decreasing trends), which are then passed to an NLG module that plans and generates natural language text descriptions.
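The pipeline described above (data analysis first, then text generation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the region names, values, and coordinates are invented, a simple top-k/bottom-k ranking stands in for the clustering step, a correlation against easting stands in for trend detection, and the function names `extremes`, `trend`, and `describe` are hypothetical.

```python
from statistics import mean

# Hypothetical input rows: (region, value, easting, northing). All figures are invented.
DATA = [
    ("Northmoor", 4.0, 0.0, 3.0),
    ("Eastvale", 9.0, 3.0, 2.0),
    ("Westfield", 3.0, -3.0, 1.0),
    ("Southport", 8.0, 2.0, -2.0),
    ("Midtown", 6.0, 0.5, 0.0),
]


def extremes(rows, k=2):
    """Return the k highest and k lowest regions by value.

    Assumption: a plain ranking replaces the clustering used in the paper.
    """
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    return [r[0] for r in ranked[:k]], [r[0] for r in ranked[-k:]]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def trend(rows, threshold=0.7):
    """Report a west-to-east trend when value correlates strongly with easting."""
    r = pearson([row[2] for row in rows], [row[1] for row in rows])
    if r >= threshold:
        return "increases from west to east"
    if r <= -threshold:
        return "decreases from west to east"
    return None  # no clear trend along this axis


def describe(rows, variable):
    """Turn the detected patterns into a short text (the NLG stage, much simplified)."""
    high, low = extremes(rows)
    sentences = [
        f"{variable} is highest in {' and '.join(high)}.",
        f"It is lowest in {' and '.join(low)}.",
    ]
    direction = trend(rows)
    if direction:
        sentences.append(f"Overall, it {direction}.")
    return " ".join(sentences)


print(describe(DATA, "Unemployment"))
```

The point of the sketch is the separation of stages: the analysis functions decide what is worth saying, and only then does `describe` decide how to say it.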
Key findings
The usability study revealed two major difficulties for blind participants: inferring trends from tabular data (because they had to remember and mentally rank dozens of values alongside their geographic region names) and mentally visualizing geographically unfamiliar areas even when given the data. Gaining an overview of the data was "uniformly very difficult" for blind participants, even those experienced in statistical data analysis. One professional data analyst who was blind noted that while screen readers make data accessible, they do not enable quick or easy overviews of general trends.

Surprisingly, blind participants also had more difficulty than sighted participants interpreting expert-written texts, likely because the texts, at 200 to 300 words each, were dense and blind participants could not skim for keywords as sighted participants could. For unfamiliar regions, blind participants could not mentally locate areas even when hearing table values, whereas sighted participants could perceive geographic distribution patterns from maps regardless of familiarity.

These findings led to key design decisions for Atlas.txt: texts should begin with geographic context (introducing the region, its location within a larger area, and salient geographic orientation features like rivers and coastlines) before describing data distributions, and should include brief location descriptions for named sub-areas to enable mental visualization. The system aims to communicate maximum values, minimum values, and trends, which corpus analysis identified as the messages most commonly communicated by expert statisticians.
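The ordering implied by these design decisions can be sketched as a small text planner in Python. This is an illustration under stated assumptions, not the Atlas.txt planner: the `Region` class, the `locate` and `plan_text` functions, the example region, and the choice to put the maximum before the minimum and the trend last are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    # Where the region sits and which orientation features bound it.
    context: str
    # Sub-area name -> brief location phrase (hypothetical gazetteer output).
    locations: dict = field(default_factory=dict)


def locate(region, area):
    """Append a brief location phrase to a named sub-area when one is known."""
    hint = region.locations.get(area)
    return f"{area} ({hint})" if hint else area


def plan_text(region, variable, maximum, minimum, trend=None):
    """Order the content: geographic context first, then the data messages."""
    sentences = [region.context]
    sentences.append(
        f"{variable} is highest in {locate(region, maximum)} "
        f"and lowest in {locate(region, minimum)}."
    )
    if trend:
        sentences.append(f"Overall, {variable.lower()} {trend}.")
    return " ".join(sentences)


# Invented example region and values, for illustration only.
EXAMPLE = Region(
    context=(
        "Exampleshire is a coastal region in the north-east of the country, "
        "with the sea to its east and hills to its west."
    ),
    locations={
        "Eastvale": "on the eastern coast",
        "Westfield": "in the western hills",
    },
)

text = plan_text(
    EXAMPLE, "Unemployment", "Eastvale", "Westfield", "rises from west to east"
)
print(text)
```

Putting the context sentence first means a listener has a mental frame for the region before any sub-area is named, which is the behavior the study's findings argue for.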
Relevance
This paper addresses a critical and often overlooked accessibility gap: the inaccessibility of geographic and spatial data visualizations. While alt text for simple images is well understood, providing meaningful alternative access to complex data visualizations like maps remains an unsolved problem. The finding that simply reading out data tables is inadequate — because it fails to convey spatial patterns and relationships — has direct implications for any organization publishing geographic data, dashboards, or data visualizations online. The approach of using natural language generation to automatically produce textual descriptions anticipates the growing interest in AI-generated alt text and accessible data narratives. For accessibility practitioners, the study's insight that geographic familiarity is crucial for comprehension suggests that accessible map descriptions must include contextual geographic information, not just data values. The work connects to broader challenges in data visualization accessibility that remain highly relevant as dashboards and interactive maps become ubiquitous in government services, news, and public health reporting. The complementary relationship noted between textual descriptions and sonification (non-speech audio) approaches also points toward multimodal solutions for complex data accessibility.
Tags: natural language generation · visual impairment · data accessibility · geographic information · data visualization · screen readers · census data