Audio Presentation of Auto-Suggest Lists

Andy Brown, Caroline Jay, Simon Harper · 2009 · Proceedings of the 2009 International Cross-Disciplinary Conference on Web Accessibility (W4A) · doi:10.1145/1535654.1535667

Summary

This paper investigates how to make auto-suggest lists (ASLs), the dropdown suggestions that appear as users type in search boxes and form fields, accessible through audio for visually impaired users. Part of the SASWAT (Structured Accessibility Stream for Web 2.0 Access Technologies) project at the University of Manchester, the research used eye tracking to understand how sighted users interact with ASLs, then applied those insights to design an audio presentation strategy. The researchers noted that ASLs were ubiquitous: a survey of the Alexa top 20 sites found 12 instances. Yet screen readers at the time handled them poorly: Orca read all suggestions automatically without allowing manual navigation, while JAWS and HAL allowed manual browsing but did not announce when the list updated. The eye-tracking study used a Tobii 1750 tracker with 30 participants (17 male, 13 female, aged 18-34) who completed tasks on Kayak (flight booking), Google Suggest, and Yahoo! Search. Gaze data was analysed by defining Areas of Interest for the input box and each suggestion position (up to 6), then measuring fixation percentage, time to first fixation, fixation count, and fixation duration for each position.
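
The four per-position measures can be illustrated with a minimal sketch. It assumes each fixation has already been assigned to an Area of Interest; the `Fixation` record, the AOI labels ("input", "s1" to "s6") and the `aoi_metrics` function are hypothetical names chosen for illustration, not the authors' analysis pipeline or the Tobii software's output format.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    participant: str
    aoi: str            # hypothetical AOI label: "input", or "s1".."s6" per list position
    start_ms: float     # fixation onset, measured from the moment the list appeared
    duration_ms: float


def aoi_metrics(fixations: list[Fixation], participants: list[str], aoi: str) -> dict:
    """Summarise one Area of Interest across all participants."""
    hits = [f for f in fixations if f.aoi == aoi]

    # Group the hits by participant so each person counts once for the percentage.
    by_participant: dict[str, list[Fixation]] = {}
    for f in hits:
        by_participant.setdefault(f.participant, []).append(f)

    n_fixated = len(by_participant)
    first_times = [min(f.start_ms for f in fs) for fs in by_participant.values()]

    return {
        # Share of participants who looked at this AOI at least once.
        "fixation_pct": 100.0 * n_fixated / len(participants) if participants else 0.0,
        # Mean onset of each participant's earliest fixation on the AOI.
        "mean_time_to_first_ms": sum(first_times) / n_fixated if n_fixated else None,
        # Total number of fixations that landed on the AOI.
        "fixation_count": len(hits),
        # Mean length of those fixations.
        "mean_duration_ms": sum(f.duration_ms for f in hits) / len(hits) if hits else None,
    }
```

Calling `aoi_metrics` once per AOI label yields one row per suggestion position, which is the shape of comparison the paper reports. Participants who never fixated an AOI lower its fixation percentage but are excluded from its time-to-first-fixation mean.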

Key findings

Across 90 ASL encounters, 97.5% of participants fixated somewhere on the suggestion list, confirming that ASLs receive near-universal attention. However, attention dropped sharply by list position: on average 75% of participants fixated suggestion 1, 63% fixated suggestion 2, 39% fixated suggestion 3, and no suggestion beyond position 3 was fixated by more than 50% of participants. The first suggestion received significantly more fixations (F(2,56) = 6.16, p < 0.005) and significantly longer fixation durations (F(2,20) = 10.89, p < 0.001) than lower positions. Crucially, some participants viewed the list extensively without ever selecting from it, suggesting that ASLs serve a reassurance function, confirming that the user's input is reasonable, beyond simply offering shortcuts. Sighted users typically typed a few letters, glanced at the suggestions, then either continued typing or selected one, repeating this cycle. Based on these findings, the authors proposed automatically speaking the first 3 suggestions when the user pauses typing, with the arrow keys available to browse the full list and Enter to select the currently spoken item. Continued typing would interrupt speech and resume normal input.
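
The proposed interaction can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the authors' implementation: the class `SuggestSpeaker` and its method names are hypothetical, speech is recorded in a list rather than synthesised, pause detection is left to the caller, and Enter selects only an item reached with the arrow keys (a simplification of "the currently spoken item").

```python
from dataclasses import dataclass, field


@dataclass
class SuggestSpeaker:
    """Toy model of the proposed audio presentation (hypothetical names)."""
    auto_count: int = 3                                   # suggestions spoken automatically
    text: str = ""                                        # contents of the input box
    suggestions: list[str] = field(default_factory=list)
    cursor: int = -1                                      # item reached via arrow keys, -1 = none
    spoken: list[str] = field(default_factory=list)       # log standing in for speech output
    speaking: bool = False

    def _say(self, utterance: str) -> None:
        self.spoken.append(utterance)
        self.speaking = True

    def type_char(self, ch: str) -> None:
        # Continued typing interrupts speech and resumes normal input.
        self.speaking = False
        self.cursor = -1
        self.text += ch

    def pause(self, suggestions: list[str]) -> None:
        # The caller decides what counts as a pause; speak only the top few items.
        self.suggestions = list(suggestions)
        self.cursor = -1
        for s in self.suggestions[: self.auto_count]:
            self._say(s)

    def arrow_down(self) -> None:
        # Browse the full list on demand, stopping at the last item.
        if self.suggestions:
            self.cursor = min(self.cursor + 1, len(self.suggestions) - 1)
            self._say(self.suggestions[self.cursor])

    def arrow_up(self) -> None:
        if self.suggestions:
            self.cursor = max(self.cursor - 1, 0)
            self._say(self.suggestions[self.cursor])

    def enter(self) -> str:
        # Simplification: accept the arrow-selected item, else keep the typed text.
        if 0 <= self.cursor < len(self.suggestions):
            self.text = self.suggestions[self.cursor]
        self.suggestions, self.cursor, self.speaking = [], -1, False
        return self.text


# Illustrative usage with made-up suggestions.
speaker = SuggestSpeaker()
for ch in "man":
    speaker.type_char(ch)
speaker.pause(["manchester", "manila", "manaus", "mannheim"])
print(speaker.spoken)  # ['manchester', 'manila', 'manaus']
```

Keeping `auto_count` as a parameter reflects that the cut-off of three comes from the fixation data rather than from any technical constraint. The fourth suggestion is never spoken automatically but remains reachable with the arrow keys.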

Relevance

This paper demonstrates the value of empirically studying sighted user behaviour before designing accessible alternatives to dynamic web content, an approach the same team also applied to calendar date pickers. The finding that attention drops dramatically after the third suggestion has practical design implications that remain relevant today: ARIA live region implementations that announce every suggestion and those that announce none are both suboptimal. Modern combobox/autocomplete ARIA patterns have evolved toward the approach suggested here (announcing a count of results and allowing arrow-key navigation), but implementation quality varies widely. The observation that ASLs serve a reassurance function beyond mere selection could inform how screen readers present suggestions: rather than treating them purely as navigation targets, the audio presentation should convey that the system recognises the user's input as valid. For developers implementing accessible autocomplete components, the paper provides empirical justification for limiting automatic announcements to the top few results while making the full list browsable on demand.

Tags: visual impairment · screen readers · dynamic content · Web 2.0 · auto-suggest · autocomplete · eye tracking · AJAX · ARIA live regions · non-visual interaction · audio interface

Standards referenced: WAI-ARIA