Interactive SIGHT into Information Graphics
Seniz Demir, David Oliver, Edward Schwartz, Stephanie Elzer, Sandra Carberry, Kathleen F. McCoy · 2010 · Proceedings of the 2010 International Cross Disciplinary Conference on Web Accessibility (W4A) · doi:10.1145/1805986.1806009
Summary
This paper presents Interactive SIGHT (Summarizing Information GrapHics Textually), a system that gives visually impaired users access to the high-level knowledge conveyed by bar charts in electronic documents. Unlike approaches that reproduce the graphic in an alternative medium (sound or touch) or list raw data values, Interactive SIGHT infers the communicative intent of a bar chart, that is, the message its designer intended to convey, and generates natural language summaries centered on that message. The system is implemented as a browser extension (a Browser Helper Object for Internet Explorer) and is used together with the JAWS screen reader. When a user encounters a graphic and presses CONTROL+Z, the Visual Extraction Module uses image processing to identify the elements of the bar chart and produces an XML representation of it. The Intention Recognition Module then uses a Bayesian network to infer the chart's intended message from three types of communicative signals: the relative perceptual effort required for different visual tasks, visual highlighting of specific elements (such as a differently colored bar), and suggestive verbs and adjectives in the caption. The Generation Module produces an initial summary that conveys the inferred message plus salient features of the chart, then supports three types of history-aware follow-up responses: General (additional propositions ranked by a weighted PageRank algorithm over a relation graph), Focused (propositions categorized by the type of information relevant to the chart's message), and Specific (detailed data about individual bars or comparisons). The dialogue history lets responses avoid redundancy and introduce previously communicated information with discourse markers such as 'Recall that'.
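To make the General follow-up response more concrete, the following is a minimal sketch of weighted PageRank used to rank candidate propositions while skipping those already in the dialogue history. It is not the authors' implementation: the proposition names, edge weights, damping factor, and the functions `weighted_pagerank` and `next_propositions` are all assumptions made for illustration, since the summary does not specify how the relation graph is built or weighted.

```python
from typing import Dict, List, Set

# Hypothetical relation graph: proposition -> {related proposition: edge weight}.
# Every proposition that appears as a target must also appear as a key.
Graph = Dict[str, Dict[str, float]]

def weighted_pagerank(graph: Graph, damping: float = 0.85,
                      iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank where a node passes on its score
    in proportion to the weights of its outgoing edges."""
    nodes = list(graph)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    out_weight = {node: sum(graph[node].values()) for node in nodes}
    for _ in range(iterations):
        new_scores = {node: (1.0 - damping) / n for node in nodes}
        for src in nodes:
            if out_weight[src] == 0:
                # Dangling node: spread its score evenly over all nodes.
                for dst in nodes:
                    new_scores[dst] += damping * scores[src] / n
                continue
            for dst, weight in graph[src].items():
                new_scores[dst] += damping * scores[src] * weight / out_weight[src]
        scores = new_scores
    return scores

def next_propositions(graph: Graph, already_said: Set[str], k: int = 2) -> List[str]:
    """Return the k highest-ranked propositions not yet in the dialogue history."""
    scores = weighted_pagerank(graph)
    candidates = [p for p in graph if p not in already_said]
    return sorted(candidates, key=lambda p: scores[p], reverse=True)[:k]

# Illustrative graph only; names and weights are invented for this sketch.
relation_graph: Graph = {
    "max_bar": {"range": 2.0, "trend": 1.0},
    "range": {"max_bar": 2.0},
    "trend": {"max_bar": 1.0},
    "min_bar": {"range": 1.0},
}

if __name__ == "__main__":
    history = {"max_bar"}  # proposition already conveyed in the initial summary
    print(next_propositions(relation_graph, history, k=2))
```

In this sketch the dialogue history is a plain set of proposition identifiers; a fuller system would also record how each proposition was phrased so that repeated content can be introduced with a marker such as 'Recall that'.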
Key findings
A corpus study of 100 randomly selected graphics from newspapers and magazines found that little or none of the graphic's underlying message was captured by the accompanying article text in over 60% of cases, confirming that graphics carry unique informational content that cannot be derived from surrounding text alone. In the first evaluation with 19 sighted graduate students, none found the initial summaries misleading, and there was no consensus on omitted propositions that should have been included, validating the content selection approach. Participants rated the system 3.7 out of 5 for satisfaction, with verbosity cited as the main concern. In the second evaluation with 7 visually impaired participants using JAWS, all correctly answered key questions about three bar charts using the system. Users rated usefulness at 9.4 out of 10 and ease of use at 8.7 out of 10. All participants used all three types of follow-up responses, but each followed different exploration paths through the same chart, validating the interactive approach over a one-size-fits-all summary. One participant remarked: 'I think that having a system that can describe bar charts to blind and visually impaired users is an extremely valuable resource. If this program had been available to me, I would have had the ability to function as everyone else would.' A key design insight was that congenitally blind users often cannot identify what further questions to ask about a chart they have never seen, motivating the menu-based interface rather than free-form questions.
Relevance
This paper pioneered an approach to graphic accessibility that goes beyond alt text to convey the communicative intent — the 'why' — of information graphics, not just their visual appearance or raw data. The insight that over 60% of graphics in popular media carry messages not captured in surrounding text means that skipping graphics represents a significant information loss for screen reader users, not merely a visual inconvenience. The system's approach of inferring designer intent through Bayesian reasoning anticipated modern AI-powered image description, though with more structured and domain-specific methods. The interactive dialogue model — providing a brief initial summary with drill-down options — offers a template for how complex visual content should be made accessible: not through exhaustive descriptions, but through layered access that lets users explore at their chosen depth. The finding that congenitally blind users need guided menus rather than open-ended questions has implications for any system presenting complex visual information to people who have never had visual experience. For practitioners today, the paper underscores that quality alt text for charts should convey the message and key comparisons, not just 'bar chart showing GDP data.'
Tags: visual impairment · data visualization · natural language generation · graph accessibility · bar charts · image accessibility · alternative text · Bayesian network · information retrieval · interactive dialogue · screen readers