Detecting Objects and Obstacles for Visually Impaired Individuals Using Visual Saliency
Benoît Deville, Guido Bologna, Thierry Pun · 2010 · Proceedings of the 12th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2010) · doi:10.1145/1878803.1878857
Summary
This demo paper presents the detection module of See ColOr (Seeing Colors with an Orchestra), a mobility aid for visually impaired people developed at the University of Geneva. See ColOr converts visual information from the environment into musical instrument sounds, a synesthetic approach that lets users follow a colored line, find colored objects, or match objects by color. The detection module described here extends the system to identify objects and obstacles that are of interest to the user or pose a threat.
The hardware consists of a lightweight (under 200 g) stereoscopic camera mounted on a protective helmet, connected to a laptop carried in a small case on the user's back, with standard headphones for audio output. The camera captures 640×480 color images and computes disparity (depth) maps in real time at approximately 25 Hz.
The detection approach uses bottom-up visual saliency, which models what would naturally attract a sighted person's visual attention, to identify areas of interest. The system computes specific feature maps (color, depth) depending on the scenario, combines them into conspicuity maps using center-surround difference operations, and generates a final saliency map. When a focus of attention (FOA) persists as the highest peak of the saliency map for more than a threshold number of frames, the user receives a vocal alert describing the object's position.
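The saliency pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the box-filter center-surround operator, the window radii, the min-max normalization, the averaging of conspicuity maps, and the names `saliency_map`, `focus_of_attention`, and `PersistenceTracker` are all assumptions chosen for brevity. It uses one conspicuity map per feature map and a single scale, whereas saliency models of this kind typically work across several scales.

```python
from typing import List, Optional, Tuple

Grid = List[List[float]]  # a small 2-D map stored as rows of floats


def box_mean(img: Grid, radius: int) -> Grid:
    """Mean over a (2*radius+1)-wide square window, clipped at the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def center_surround(feature: Grid, center_r: int = 1, surround_r: int = 4) -> Grid:
    """|fine mean - coarse mean|: large where a region differs from its surround."""
    center = box_mean(feature, center_r)
    surround = box_mean(feature, surround_r)
    return [[abs(c - s) for c, s in zip(rc, rs)] for rc, rs in zip(center, surround)]


def normalize(m: Grid) -> Grid:
    """Rescale to [0, 1]; a (near-)flat map carries no information and becomes zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    if hi - lo < 1e-9:
        return [[0.0] * len(row) for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]


def saliency_map(feature_maps: List[Grid]) -> Grid:
    """Average the normalized conspicuity maps (assumed combination rule)."""
    conspicuity = [normalize(center_surround(f)) for f in feature_maps]
    h, w, n = len(conspicuity[0]), len(conspicuity[0][0]), len(conspicuity)
    return [[sum(c[y][x] for c in conspicuity) / n for x in range(w)] for y in range(h)]


def focus_of_attention(saliency: Grid) -> Tuple[int, int]:
    """(row, col) of the highest saliency peak; first one wins on ties."""
    best, best_pos = float("-inf"), (0, 0)
    for y, row in enumerate(saliency):
        for x, v in enumerate(row):
            if v > best:
                best, best_pos = v, (y, x)
    return best_pos


class PersistenceTracker:
    """Signals once when the FOA has stayed put for `min_frames` consecutive frames."""

    def __init__(self, min_frames: int = 5, tolerance: int = 2) -> None:
        self.min_frames = min_frames   # hypothetical threshold
        self.tolerance = tolerance     # allowed per-frame drift in pixels
        self._last: Optional[Tuple[int, int]] = None
        self._count = 0

    def update(self, foa: Tuple[int, int]) -> bool:
        near = (self._last is not None
                and abs(foa[0] - self._last[0]) <= self.tolerance
                and abs(foa[1] - self._last[1]) <= self.tolerance)
        self._count = self._count + 1 if near else 1
        self._last = foa
        return self._count == self.min_frames


# Toy scene: a small "near" patch in the depth map, and a uniform color map.
depth = [[0.0] * 16 for _ in range(16)]
for y in range(5, 8):
    for x in range(9, 12):
        depth[y][x] = 1.0
color = [[0.5] * 16 for _ in range(16)]

foa = focus_of_attention(saliency_map([depth, color]))  # (6, 10), the patch center
tracker = PersistenceTracker(min_frames=3)
alerts = [tracker.update(foa) for _ in range(4)]         # [False, False, True, False]
```

In this toy scene the uniform color map contributes nothing, so the peak comes entirely from the depth map. A real system would trigger the vocal alert at the frame where `update` returns `True`.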
Key findings
Preliminary results from tests on stereo video clips depicting realistic navigation scenarios showed promising performance. Computing the focus of attention took approximately 200 ms per frame (about five frames per second), enabling near real-time responsiveness to environmental changes. A key technical finding was that the approach is robust even when some feature maps are removed: the system detects objects and obstacles with accuracy comparable to that obtained with all available feature maps, which allows computation to be reduced for specific scenarios. The system addresses a genuine gap in existing mobility aids, namely small, flat, or aerial obstacles that are easily missed by white canes and guide dogs, such as objects on tables or obstacles at head height. Alerts are delivered as vocal sentences describing the object's global position, providing spatial context rather than just a warning signal. The authors planned real-life experiments with both blindfolded and visually impaired participants to validate the approach in practical navigation scenarios.
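To make the idea of a positional vocal alert concrete, here is a minimal sketch that turns a focus-of-attention pixel into a coarse spoken phrase. The 3×3 grid, the wording of the sentence, and the function name `describe_position` are hypothetical; the summary above does not specify how See ColOr phrases its alerts.

```python
def describe_position(row: int, col: int, height: int, width: int) -> str:
    """Map a pixel position to a phrase on a 3x3 grid (hypothetical wording)."""
    vertical = ("upper", "middle", "lower")[min(2, row * 3 // height)]
    horizontal = ("left", "center", "right")[min(2, col * 3 // width)]
    return f"Object detected, {vertical} {horizontal}"


# A focus of attention near the top right of a 640x480 frame,
# for example an obstacle at head height.
message = describe_position(row=100, col=600, height=480, width=640)
# message == "Object detected, upper right"
```

The resulting string would then be passed to a text-to-speech engine. A fuller version could also use the disparity map to add an approximate distance.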
Relevance
This research is an early example of applying computer vision and visual saliency models to real-time obstacle detection for blind and low-vision users; computer vision has since become central to assistive technology apps. The core insight, that saliency-based detection can identify objects that traditional mobility aids miss (aerial obstacles, small objects on surfaces), remains relevant. For accessibility practitioners, the system illustrates important design principles: using sensory substitution (visual-to-audio) to convey spatial information, prioritizing real-time performance over perfect accuracy, and adapting processing to specific user scenarios rather than using a one-size-fits-all approach. While the 2010 hardware (helmet-mounted camera, laptop carried on the back) was bulky, the underlying approach anticipated later smartphone-based assistive tools, including audio navigation aids such as Microsoft Soundscape.
Tags: visual impairment · obstacle detection · mobility aid · computer vision · sensory substitution · visual saliency · assistive technology