Design and Development of an Indoor Navigation and Object Identification System for the Blind
Andreas Hub, Joachim Diepstraten, Thomas Ertl · 2003 · Proceedings of the 6th International ACM SIGACCESS Conference on Computers and Accessibility (Assets '04) · doi:10.1145/1028630.1028657
Summary
This paper presents a multi-sensor orientation assistant for blind users navigating unknown indoor environments. The system addresses three core problems identified by blind users: determining one's position, determining head direction and movement direction, and identifying objects in the near and distant environment. The hardware consists of a hand-held sensor module shaped like a flashlight that integrates a stereo camera, keyboard, and cellular phone, designed to attach to a standard white cane so that only one hand is needed for both the cane and the device. The sensor module is augmented with a digital compass, 3D inclination sensor, and ultrasonic distance sensor, and connects via WLAN to a portable computer. The system combines local sensor data (colour detection from the stereo camera, distance measurement, compass heading) with pre-built 3D models of the building environment stored on networked servers. Colour detection uses a model of human colour vision and psychophysical algorithms to name colours via text-to-speech — early usability tests showed this was of great interest to blind users, who frequently discuss colour but have no independent means of determining it. The stereo camera also measures object distance and calculates physical dimensions (width, height). For indoor navigation, the system uses WLAN-based positioning to locate the user within a 3D model of the building, rendered using OpenSceneGraph. A virtual camera in the 3D scene corresponds to the user's position, and a "pick cone" cast from this camera matches objects in the 3D model against local sensor readings to identify what the user is pointing at.
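The pick-cone matching described above can be illustrated with a minimal Python sketch. It is not the authors' OpenSceneGraph implementation. The `ModelObject` class, the `pick` function, the coordinate convention (heading 0 = +y, heading 90 = +x, z up), and the 0.5 m distance tolerance are all assumptions made for illustration, and each object is reduced to its centre point.

```python
import math
from dataclasses import dataclass


@dataclass
class ModelObject:
    """Hypothetical stand-in for an object in the 3D building model."""
    name: str
    center: tuple  # (x, y, z) in metres, model coordinates


def direction_from_heading(heading_deg, pitch_deg):
    """Unit view vector from a compass heading and an inclination angle.

    Assumed convention: heading 0 points along +y, heading 90 along +x,
    and positive pitch tilts the view upwards (+z).
    """
    h = math.radians(heading_deg)
    p = math.radians(pitch_deg)
    return (math.sin(h) * math.cos(p), math.cos(h) * math.cos(p), math.sin(p))


def pick(objects, position, heading_deg, pitch_deg, half_angle_deg,
         measured_dist=None, tolerance=0.5):
    """Return names of objects inside the cone, nearest first.

    If a locally measured distance is given, keep only objects whose
    model distance agrees with it to within `tolerance` metres.
    """
    view = direction_from_heading(heading_deg, pitch_deg)
    cos_limit = math.cos(math.radians(half_angle_deg))
    hits = []
    for obj in objects:
        to_obj = tuple(c - p for c, p in zip(obj.center, position))
        dist = math.sqrt(sum(x * x for x in to_obj))
        if dist == 0:
            continue  # object coincides with the virtual camera
        cos_angle = sum(a * b for a, b in zip(to_obj, view)) / dist
        if cos_angle < cos_limit:
            continue  # outside the cone
        if measured_dist is not None and abs(dist - measured_dist) > tolerance:
            continue  # model and sensor reading disagree
        hits.append((dist, obj.name))
    hits.sort()
    return [name for _, name in hits]


scene = [
    ModelObject("door", (0.0, 3.0, 1.0)),
    ModelObject("table", (3.0, 0.0, 0.7)),
    ModelObject("window", (0.0, 6.0, 1.5)),
]
user = (0.0, 0.0, 1.0)
candidates = pick(scene, user, heading_deg=0, pitch_deg=0, half_angle_deg=10)
confirmed = pick(scene, user, 0, 0, 10, measured_dist=6.0)
```

In this sketch the cone alone returns both the door and the window, and the measured distance narrows the result to the window. That is the role the summary attributes to matching model objects against local sensor readings.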
Key findings
The system successfully demonstrated real-time colour detection and naming, object distance and size estimation via stereo camera, and matching of local sensor data against 3D building models for object identification. The colour detection algorithm, developed from psychophysical experiments with normally-sighted participants and incorporating models of chromatic induction and colour constancy, could name the colours of objects without physical contact, which two blind test participants found particularly valuable for tasks like sorting clothes and identifying food. The stereo camera approach solved problems with auto-focus-based distance measurement (time delays, range limitations, inaccuracy in complex scenes). The 3D environment model of the university building's first floor was created with millimetre accuracy and included room furniture. The scene graph architecture allowed hierarchical object information retrieval: for example, identifying not just "door handle" but also that it belongs to a door leading to a neighbouring room. The Nexus platform architecture used a federated server model with caching servers to manage the large 3D datasets without overloading the portable computer. Remaining challenges included a battery life limited to 5 hours, electromagnetic interference affecting compass readings, and the fundamental difficulty of creating and maintaining accurate 3D models of buildings.
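The hierarchical retrieval mentioned above can be sketched as a walk from a picked node up through its parents. This is a minimal illustration in Python, not the paper's OpenSceneGraph or Nexus code. The `Node` class, its `description` field, and the `context_path` and `describe` helpers are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical scene-graph node with a link back to its parent."""
    name: str
    description: str = ""
    parent: Optional["Node"] = field(default=None, repr=False)
    children: list = field(default_factory=list, repr=False)

    def add(self, child):
        """Attach a child node and return it, so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child


def context_path(node):
    """Names from the picked node up to the root of the graph."""
    path = []
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path


def describe(node, levels=1):
    """Spoken-style description including up to `levels` ancestors."""
    parts = [node.name]
    current = node.parent
    for _ in range(levels):
        if current is None:
            break
        label = current.name
        if current.description:
            label += " (" + current.description + ")"
        parts.append(label)
        current = current.parent
    return ", part of ".join(parts)


# Illustrative model fragment; the names are made up for this example.
floor = Node("first floor")
room = floor.add(Node("room A"))
door = room.add(Node("door", "leads to neighbouring room"))
handle = door.add(Node("door handle"))

message = describe(handle)
```

Limiting the number of ancestor levels is one simple way to keep the spoken output short. How much context the real system reports is not stated in this summary.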
Relevance
This paper represents an early and ambitious attempt at the kind of indoor navigation and object identification system that has become increasingly feasible with modern smartphones and computer vision. The multi-modal sensor fusion approach — combining visual (camera), spatial (compass, inclination, ultrasonic), and environmental (WLAN positioning, 3D building models) data — anticipates the architecture of contemporary indoor navigation solutions. The finding that colour identification was of particular interest to blind users highlights an often-overlooked aspect of environmental awareness: blind people want to know about aesthetic and social properties of their environment, not just navigational information. For practitioners, this work illustrates both the potential and the persistent challenges of indoor navigation for blind users: WLAN positioning accuracy limits remain relevant today, creating and maintaining 3D building models at scale is still a major obstacle, and the information filtering problem — determining what is relevant to communicate to the user without causing overload — continues to be a core design challenge. The integration of the sensor module with a standard white cane reflects an important design principle: new technology should complement rather than replace existing mobility tools that users trust.
Tags: indoor navigation · blindness and low vision · object recognition · orientation and mobility · mobile technology · computer vision · text-to-speech · wayfinding · sensors · 3D modeling
Standards referenced: IEEE 802.11