Using simultaneous audio sources to speed-up blind people's web scanning

João Guerreiro · 2013 · Proceedings of the 10th International Cross-Disciplinary Conference on Web Accessibility (W4A) · doi:10.1145/2461121.2461154

Summary

This doctoral consortium paper proposes using multiple simultaneous audio sources to help blind users scan web content faster. The core problem is that screen readers present information sequentially, one item at a time, while sighted users can scan an entire page at a glance, skimming headings or reading first sentences to locate relevant content quickly. The author draws on the Cocktail Party Effect, the well-documented human ability to focus attention on one voice among many simultaneous speakers and to detect important keywords (like one's own name) in unattended streams. The proposal builds on research showing that spatial separation of sound sources, differences in voice frequency, and asynchronous timing all improve the listener's ability to segregate simultaneous speech streams. The author also notes that blind people, particularly those blind from birth or early childhood, show enhanced perceptual and attentional sensitivity for speech identification, attributed to neuroplasticity and sensory compensation, which suggests they may be especially well placed to benefit from multi-source audio interfaces.

Key findings

This is an early-stage research proposal rather than a completed study, so empirical results are not yet available. The key theoretical contribution is a reframing of the scanning task: during web scanning, users do not need to comprehend every word from every source; they only need to identify which information stream contains content of interest and then focus attention on it. Scanning is therefore a divided attention task (monitoring all sources) rather than the selective attention task (attending to one source) that most previous cocktail party experiments tested. The author hypothesises that longer, more natural text passages will be easier to process in parallel than the short phrases used in laboratory speech perception experiments, because context and redundancy aid comprehension. The proposed experiments aim to find the best configurations of spatial location, voice characteristics, and timing specifically for web content.
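To make the three variables concrete (spatial location, voice characteristics, and timing), the following is a minimal illustrative sketch of how one configuration for N simultaneous streams might be described. It is not taken from the paper: the names `StreamConfig` and `layout_streams`, the voice labels, the 180-degree frontal span, and the 0.5-second stagger are all assumptions chosen for illustration, and the sketch only builds a description of the layout rather than rendering any audio.

```python
from dataclasses import dataclass


@dataclass
class StreamConfig:
    """One simultaneous speech stream (hypothetical structure, not from the paper)."""
    text: str           # content to be spoken, e.g. one search result
    azimuth_deg: float  # horizontal angle; negative values are to the listener's left
    voice: str          # voice label, alternated so neighbouring streams differ
    onset_s: float      # start delay in seconds, so streams begin asynchronously


def layout_streams(texts, voices=("female", "male"), span_deg=180.0, stagger_s=0.5):
    """Spread streams evenly across a frontal arc, alternate voices, stagger onsets.

    All default values are illustrative assumptions, not experimental findings.
    """
    n = len(texts)
    configs = []
    for i, text in enumerate(texts):
        if n == 1:
            azimuth = 0.0  # a single stream sits straight ahead
        else:
            # Evenly spaced from -span/2 (far left) to +span/2 (far right).
            azimuth = -span_deg / 2 + i * span_deg / (n - 1)
        configs.append(
            StreamConfig(
                text=text,
                azimuth_deg=azimuth,
                voice=voices[i % len(voices)],
                onset_s=i * stagger_s,
            )
        )
    return configs


if __name__ == "__main__":
    for cfg in layout_streams(["Result one", "Result two", "Result three"]):
        print(cfg)
```

An experiment of the kind proposed could then vary `span_deg`, the set of voices, and `stagger_s` across conditions and measure how quickly participants identify the relevant stream.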

Relevance

This research addresses a fundamental asymmetry in web accessibility: sighted users can process visual information in parallel (scanning a page of search results at a glance), while blind screen reader users are forced into strictly sequential processing. Even with strategies like heading navigation and increased speech rate, blind users remain significantly slower at finding relevant content. The proposal to exploit the auditory system's capacity for parallel processing, and in particular blind users' enhanced auditory abilities, is a creative approach to narrowing this gap. Although the paper presents only a research direction, it points toward an important design space for future screen reader and audio interface development: moving beyond single-stream sequential output and using spatial audio and simultaneous sources for information triage tasks such as scanning search results, email inboxes, or social media feeds.

Tags: blindness · screen readers · spatial audio · web navigation · auditory perception · information overload · non-visual interaction