
Experimental Analysis of a Spatialised Audio Interface for People with Visual Impairments

Jacobus C. Lock, Iain D. Gilchrist, Grzegorz Cielniak, Nicola Bellotto · 2020 · ACM Transactions on Accessible Computing · doi:10.1145/3412325

Summary

This paper evaluates a spatialised audio interface designed to help people with visual impairments find objects in indoor environments. Part of the ActiVis project, the system uses a Google Project Tango mobile device with RGB-D cameras for localization and Aftershokz bone-conduction headphones to deliver audio guidance. Bone-conduction headphones were chosen because they transmit sound through the cheekbones rather than blocking the ear canal, which preserves the user's ability to hear ambient sounds, a critical safety requirement when navigating without sight. The interface conveys target direction in 3D space: horizontal (pan) direction uses a Head-Related Transfer Function (HRTF) to spatialize audio with binaural cues, while vertical (elevation) direction uses pitch variation (high pitch for targets above, low pitch for targets below) to compensate for the loss of spectral information inherent in bone conduction. The researchers ran experiments with two groups: 42 blindfolded sighted university students (mean age 20) and 10 participants with severe visual impairments (mean age 61, 7 of them totally blind). Pre-screening measured each participant's sound localization ability and pitch discrimination threshold.
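To make the elevation-to-pitch idea concrete, here is a minimal sketch of one plausible mapping. It is not the paper's implementation: the function name, the elevation limits of plus or minus pi/2 radians, and the choice of a logarithmic (equal steps in octaves) interpolation are all assumptions made for illustration. The default 64 to 4096 Hz range is the widest pitch setting this review reports.

```python
import math


def elevation_to_pitch_hz(
    elevation_rad,
    f_low=64.0,              # lowest tone, used for targets far below (from the review's widest range)
    f_high=4096.0,           # highest tone, used for targets far above
    elev_min=-math.pi / 2,   # assumed lower elevation limit (hypothetical)
    elev_max=math.pi / 2,    # assumed upper elevation limit (hypothetical)
):
    """Map a target elevation angle to a tone frequency.

    Uses geometric interpolation so that equal changes in elevation
    produce equal changes in musical interval. This is an illustrative
    assumption, not the mapping used in the paper.
    """
    # Normalise elevation to [0, 1] and clamp out-of-range angles.
    t = (elevation_rad - elev_min) / (elev_max - elev_min)
    t = min(max(t, 0.0), 1.0)
    # Interpolate on a log-frequency scale between f_low and f_high.
    return f_low * (f_high / f_low) ** t


if __name__ == "__main__":
    for angle in (-math.pi / 2, -0.5, 0.0, 0.5, math.pi / 2):
        print(f"{angle:+.2f} rad -> {elevation_to_pitch_hz(angle):.1f} Hz")
```

Under these assumptions, a wider frequency range means a steeper pitch change per radian, which is one way to read the "pitch gradient" settings compared in the study.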

Key findings

Both participant groups localized sound sources with comparable accuracy: the study found no statistically significant difference in pointing accuracy between blindfolded sighted participants and participants with visual impairments. This supports the common research practice of using blindfolded participants as proxies in accessibility studies. The pan (horizontal) dimension performed well regardless of pitch setting, with mean absolute errors around 0.25-0.26 radians (roughly 14 to 15 degrees). For elevation, the highest pitch gradient setting (64-4096 Hz range) produced the smallest angular errors: 0.36 radians versus 0.42-0.44 for the lower settings (about 21 degrees versus 24 to 25 degrees). However, a speed/accuracy tradeoff emerged: higher pitch gradients improved pointing accuracy but increased the time needed to reach targets. A key contribution is the demonstration that Fitts's Law, a predictive model of human movement, applies to this audio interface, with strong Pearson correlations (r = 0.71-0.98) between target difficulty and acquisition time. This provides a quantitative framework for optimizing audio interface parameters. The blindfolded group reached targets faster than the visually impaired group, possibly because of age differences or device familiarity, but accuracy was equivalent.
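The Fitts's Law analysis mentioned above can be sketched as follows. This is a minimal illustration, assuming the widely used Shannon formulation MT = a + b * log2(D / W + 1), where D is the distance to the target and W is its width; this review does not state which formulation the paper uses, and the function names and any data passed to them are hypothetical.

```python
import math


def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts's index of difficulty, in bits."""
    return math.log2(distance / width + 1.0)


def fit_fitts_line(ids, times):
    """Least-squares fit of MT = a + b * ID; returns (a, b)."""
    n = len(ids)
    mean_id = sum(ids) / n
    mean_t = sum(times) / n
    cov = sum((i - mean_id) * (t - mean_t) for i, t in zip(ids, times))
    var_id = sum((i - mean_id) ** 2 for i in ids)
    b = cov / var_id
    a = mean_t - b * mean_id
    return a, b


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical trials: (angular distance in rad, target width in rad, time in s).
    # These numbers are invented for illustration and are not from the paper.
    trials = [(0.4, 0.2, 2.1), (0.8, 0.2, 2.9), (1.2, 0.2, 3.3), (1.6, 0.2, 3.8)]
    ids = [index_of_difficulty(d, w) for d, w, _ in trials]
    times = [t for _, _, t in trials]
    a, b = fit_fitts_line(ids, times)
    print(f"a = {a:.2f} s, b = {b:.2f} s/bit, r = {pearson_r(ids, times):.2f}")
```

A high r between the index of difficulty and acquisition time is what indicates that the law holds for a given interface setting; the fitted slope b then gives a way to compare settings in seconds per bit.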

Relevance

This research advances the design of electronic travel aids (ETAs) by demonstrating that bone-conduction headphones, despite their limitations in conveying elevation through spectral cues, can achieve performance comparable to traditional over-ear headphones when pitch variation is used instead. Preserving ambient hearing is crucial for real-world navigation safety. Applying Fitts's Law to audio interfaces gives developers a validated metric for comparing design alternatives and tuning parameters such as pitch gradient. For practitioners building navigation aids, the finding that the "hi" pitch setting (widest frequency range) produces the best elevation accuracy, but at a time cost, suggests that adaptive systems could adjust parameters depending on whether speed or precision matters more for a given task. The comparable accuracy of blindfolded and visually impaired participants supports the practice of conducting initial accessibility research with blindfolded sighted users before involving the target population.

Tags: visual impairment · blindness · navigation · spatial audio · bone conduction · electronic travel aid · Fitts Law · audio interface · object finding