Non-Visual-Cueing-Based Sensing and Understanding of Nearby Entities in Aided Navigation
Juan Diego Gomez, Guido Bologna, Thierry Pun · 2012 · Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2012) · doi:10.1145/2384916.2384959
Summary
This demonstration paper presents a context-aware navigation aid for blind individuals that combines three levels of assistance to improve understanding of the surrounding environment. The system addresses a fundamental challenge: blind people navigating unfamiliar environments cannot perceive obstacles or make serendipitous discoveries because, without visual information, their mental model of the context is drastically limited. The hardware setup includes a Kinect-style range camera, high-quality headphones, a laptop, and an iPad tablet. The three assistance modules are: (I) an exploration module that lets users touch points on an iPad to explore a real-time 3D image captured by the range camera, where touched points are encoded as spatialized sounds and sonic effects conveying color, position, and depth, with haptic trajectory feedback to convey spatial relationships; (II) an alerting module that processes range images to detect obstacles in the user's path, predict the trajectories of detected objects, warn of potential stumbling hazards, and help find clear paths for safe navigation; and (III) a recognition engine that uses state-of-the-art object recognition methods to learn and later identify natural objects in real time, informing users when previously learned objects are present during exploration.
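To make the exploration module's encoding concrete, the following is a minimal sketch of how one touched pixel might be turned into sound parameters. It is not the authors' implementation: the summary says only that touched points convey color, position, and depth, so the specific mapping here (horizontal position to stereo pan, depth to loudness, hue to an instrument label), the `SoundCue` type, the `sonify_touch` function, the `TIMBRES` palette, and the 4 m depth range are all assumptions chosen for illustration.

```python
import colorsys
from dataclasses import dataclass


@dataclass
class SoundCue:
    pan: float    # -1.0 (far left) to 1.0 (far right)
    gain: float   # 0.0 (silent) to 1.0 (loudest)
    timbre: str   # hypothetical instrument label standing in for colour


# Hypothetical palette: one instrument per 60-degree slice of the hue wheel.
TIMBRES = ["oboe", "viola", "flute", "trumpet", "piano", "saxophone"]


def sonify_touch(x, width, rgb, depth_m, max_depth_m=4.0):
    """Map one touched pixel to sound parameters (illustrative mapping only).

    x        -- horizontal pixel index of the touched point (0 .. width - 1)
    width    -- image width in pixels (must be at least 2)
    rgb      -- (r, g, b) colour of the pixel, each component 0 .. 255
    depth_m  -- range-camera depth at the pixel, in metres
    """
    # Horizontal position becomes left/right stereo placement.
    pan = 2.0 * x / (width - 1) - 1.0

    # Nearer points are louder; depth is clamped to the assumed sensor range.
    clamped = min(max(depth_m, 0.0), max_depth_m)
    gain = 1.0 - clamped / max_depth_m

    # Hue selects an instrument from the assumed palette.
    r, g, b = (c / 255.0 for c in rgb)
    hue, _, _ = colorsys.rgb_to_hsv(r, g, b)
    index = min(int(hue * len(TIMBRES)), len(TIMBRES) - 1)

    return SoundCue(pan=pan, gain=gain, timbre=TIMBRES[index])
```

Under these assumptions, a pure red point at the left edge of a 640-pixel-wide image and 2 m away would be rendered fully to the left, at half loudness, with the first instrument in the palette. A real system would pass such parameters to a spatial audio renderer rather than stop at a data structure.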
Key findings
The system demonstrates that audio and haptic trajectory playback, coupled with computer vision methods, is a promising approach for conveying dynamic visual information about the immediate environment to blind users. The exploration module stands out for its use of spatialized sound to represent 3D spatial relationships, allowing users to understand depth and position through audio in addition to touch. The three-tiered approach addresses different information needs: low-level obstacle avoidance for safety, mid-level environmental exploration for context awareness, and high-level object recognition for identifying specific items. The recognition engine operates at a higher level of abstraction than the other two modules: it uses tracking and bootstrapping methods in a training phase, followed by online searching to identify learned objects in real time. The work is grounded in the neuroscience question of cross-modal transfer, that is, whether stimuli normally conveyed through vision can be effectively replicated through other senses, and the authors' practical experience suggests this is achievable for conveying environmental context.
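The low-level safety tier can be illustrated with a small time-to-contact check. This is a sketch under stated assumptions, not the paper's alerting method: the summary does not say how trajectories are predicted, so the code simply extrapolates linearly from two depth readings of the same tracked object. The function names `time_to_contact` and `should_alert` and the 2-second warning threshold are hypothetical.

```python
def time_to_contact(depth_prev_m, depth_now_m, dt_s):
    """Estimate seconds until contact from two depth readings of one object.

    Assumes the object keeps closing at a constant speed (linear
    extrapolation). Returns None when the object is static or receding.
    """
    closing_speed = (depth_prev_m - depth_now_m) / dt_s  # metres per second
    if closing_speed <= 0:
        return None
    return depth_now_m / closing_speed


def should_alert(depth_prev_m, depth_now_m, dt_s, warn_within_s=2.0):
    """Return True when predicted contact falls inside the warning window."""
    ttc = time_to_contact(depth_prev_m, depth_now_m, dt_s)
    return ttc is not None and ttc <= warn_within_s


# Example: an object moved from 2.0 m to 1.5 m away over half a second.
if __name__ == "__main__":
    print(should_alert(2.0, 1.5, 0.5))
```

A fixed time threshold is used here because it adapts to walking speed better than a fixed distance would: a slowly approaching object far away triggers no warning, while a fast one at the same distance does.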
Relevance
This paper represents an early integrated approach to computer-vision-assisted navigation that combines multiple sensing modalities and levels of environmental understanding. For accessibility practitioners, the three-tier architecture (obstacle detection, environmental exploration, object recognition) provides a useful framework for thinking about what blind users need when navigating: not just collision avoidance, but contextual awareness and the ability to identify and learn about objects. The use of spatialized audio to convey 3D spatial information is a technique that has continued to mature in later navigation aids. The iPad-based tactile exploration interface, where touching different points on the screen produces audio representations of the 3D scene, connects digital accessibility with physical navigation. Although computer vision, depth sensing, and spatial audio technologies have advanced dramatically since 2012, this work's multi-modal, multi-level approach to conveying visual information non-visually remains architecturally relevant.
Tags: visual impairment · blind navigation · context-aware computing · computer vision · spatial audio · haptic feedback · obstacle detection · object recognition · depth sensing · cross-modal transfer