Can a Blind Person Understand Your World?

Chieko Asakawa · 2014 · Proceedings of the 11th Web for All Conference (W4A) · doi:10.1145/2596695.2596702

Summary

This keynote paper by Chieko Asakawa of IBM Research presents a vision for the next frontier of accessibility for blind people: moving from digital information access to real-world understanding through cognitive computing. Asakawa charts the exponential growth of accessible information through three inflection points — paper Braille, digital Braille (1980s onwards), and voice-based web access (late 1990s) — and argues that the next revolution will be "real-world information access" powered by machine learning and crowdsourcing. She introduces the concept of "cognitive assistance": systems that use cognitive computing to recognise people, objects, and environments and describe them audibly to blind users. The paper describes two concrete applications. First, "Understanding People" uses augmented reality devices like smart eyeglasses to help blind people in face-to-face communication by identifying nearby people, reading their emotions, and detecting gestures. Second, "City Companion" combines computer and human intelligence for urban navigation, using GPS-based semantic analysis, object recognition, and crowdsourced descriptions when automated recognition fails. Asakawa also introduces the Accessible Photo Album (APA), a mobile app that lets blind users capture photos with audio descriptions and geolocation data, challenging the assumption that photography is only meaningful for sighted people.

Key findings

Asakawa introduces "crowd accessibility" as a hybrid approach that combines human intelligence with machine intelligence to bridge the gap until fully automated recognition matures. In this model, micro-tasks are distributed to human workers to supplement machine recognition, for example describing objects in photos that automated systems cannot identify. Critically, the human-provided data simultaneously trains the machine learning systems, creating a virtuous cycle in which crowd accessibility gradually gives way to automated assistance. The paper also makes a historical argument that accessibility needs have repeatedly driven mainstream technology innovation: Bell invented the telephone while researching communication for hearing-impaired people, the keyboard originated as an aid for motor disabilities, and mobile voice interfaces were significantly shaped by blind users' needs. Asakawa predicts that this pattern will repeat with cognitive assistance, where the "extreme needs" of blind users will motivate inventions that ultimately benefit everyone, and she frames this as an era of "assisted cognition" for all people.
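
The crowd accessibility loop summarised above can be illustrated with a minimal Python sketch. It is not taken from the paper: the class name, the confidence threshold, and the stub recognizer and crowd functions are all hypothetical, chosen only to show the general shape of "try the machine first, fall back to a human micro-task, and keep the human answer as training data".

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical signatures (assumptions, not from the paper):
# a recognizer maps a photo id to (label or None, confidence in [0, 1]);
# a crowd worker maps a photo id to a human-written description.
Recognizer = Callable[[str], Tuple[Optional[str], float]]
CrowdWorker = Callable[[str], str]


@dataclass
class CrowdAccessibilityPipeline:
    recognize: Recognizer
    ask_crowd: CrowdWorker
    threshold: float = 0.8  # assumed confidence cutoff
    # Human answers collected for later retraining of the recognizer.
    training_examples: List[Tuple[str, str]] = field(default_factory=list)

    def describe(self, photo_id: str) -> Tuple[str, str]:
        """Return (description, source), where source is 'machine' or 'crowd'."""
        label, confidence = self.recognize(photo_id)
        if label is not None and confidence >= self.threshold:
            return label, "machine"
        # Machine recognition failed or was unsure: issue a human micro-task
        # and store the answer so it can serve as training data.
        description = self.ask_crowd(photo_id)
        self.training_examples.append((photo_id, description))
        return description, "crowd"


def stub_recognizer(photo_id: str) -> Tuple[Optional[str], float]:
    # Stand-in for an automated recognizer with made-up results.
    known = {"photo-1": ("coffee cup", 0.95), "photo-2": ("dog", 0.40)}
    return known.get(photo_id, (None, 0.0))


def stub_crowd(photo_id: str) -> str:
    # Stand-in for a human worker answering a micro-task.
    return f"human description of {photo_id}"
```

In this sketch, a confident machine result is returned directly, while a low-confidence or missing result is routed to the crowd and recorded in `training_examples`. As the recognizer improves on that data, fewer requests would need the human fallback, which is the gradual handover the summary describes.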

Relevance

Written in 2014, this paper proved remarkably prescient. The cognitive assistance systems Asakawa envisioned have since materialised in tools like Be My Eyes (crowd accessibility), Microsoft Seeing AI (image recognition for blind users), and various AI-powered scene description apps. Her framing of accessibility as a driver of mainstream innovation remains a powerful argument for investment in accessibility research. The crowd accessibility concept — using human intelligence to supplement machines while simultaneously training them — anticipated the human-in-the-loop AI paradigm that now underpins much of modern machine learning. For practitioners, the paper reinforces that accessibility is not a niche concern but a frontier that pushes technology forward for everyone, and that combining crowdsourcing with AI offers practical paths to solving recognition problems that neither humans nor machines can handle alone at scale.

Tags: blindness · cognitive assistance · image recognition · computer vision · machine learning · crowdsourcing · assistive technology · navigation · augmented reality