Access on Demand: Real-time, Multi-modal Accessibility for the Deaf and Hard-of-Hearing based on Augmented Reality

Roshan Mathew, Brian Mak, Wendy Dannels · 2022 · Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22) · doi:10.1145/3517428.3551352

Summary

This experience report documents two deaf researchers' hands-on evaluation of Access on Demand (AoD), an augmented reality application developed at Rochester Institute of Technology that delivers real-time captioning and American Sign Language (ASL) interpretation through Vuzix Blade AR smart glasses. The paper is notable for centering the perspectives of Deaf and Hard-of-Hearing (DHH) users themselves: both researchers are deaf, with distinctly different communication preferences and technological backgrounds. Mak, a hard-of-hearing undergraduate journalism student who primarily uses ASL, tested the live interpreting feature, while Mathew, a late-deafened graduate computing student who relies on captioning, evaluated both real-time human captioning and auto-captioning modes.

AoD works by streaming either a live ASL interpreter video feed or text captions directly into the smart glasses display. The system requires a secondary device (phone or laptop) running in 'presenter mode' to capture audio via its microphone and relay it to the remote interpreter or captioner. For auto-captioning, AoD uses the Web Speech API for automatic speech recognition. The platform was developed by a small team of Deaf student developers at RIT, with support from the XR Accessibility Solutions Laboratory at the National Technical Institute for the Deaf. Testing took place in real-world settings including campus dining locations, office conversations, comedy shows, and wedding speeches, all situations where traditional accessibility accommodations are often unavailable or impractical.
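The summary above says only that auto-captioning relies on the Web Speech API; it does not describe how AoD calls it. As a rough illustration of what browser-based speech recognition of this kind typically looks like, here is a minimal TypeScript sketch. The function names `captionFromEvent` and `startAutoCaptions`, the `CaptionUpdate` shape, and the `en-US` language setting are all assumptions made for this example, not details taken from the paper or from AoD's code.

```typescript
// Minimal structural types mirroring the parts of the Web Speech API used below.
// They are declared locally so the sketch compiles without DOM type definitions.
interface Alternative { transcript: string; confidence: number }
interface RecognitionResult { isFinal: boolean; length: number; [index: number]: Alternative }
interface RecognitionEvent {
  resultIndex: number;
  results: { length: number; [index: number]: RecognitionResult };
}

// Hypothetical shape for what a caption display might consume.
interface CaptionUpdate { finalText: string; interimText: string }

// Pure helper: collect the results that changed in this event (from resultIndex
// onward) and separate finalized text from still-changing interim text.
function captionFromEvent(event: RecognitionEvent): CaptionUpdate {
  let finalText = "";
  let interimText = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const best = result[0].transcript; // first alternative is the top hypothesis
    if (result.isFinal) {
      finalText += best;
    } else {
      interimText += best;
    }
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}

// Browser-only wiring (assumed, not AoD's actual code): start continuous
// recognition on the device microphone and forward each update to a callback.
function startAutoCaptions(onCaption: (update: CaptionUpdate) => void): void {
  const g = globalThis as any;
  const Recognition = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Recognition) {
    throw new Error("Web Speech API is not available in this environment");
  }
  const recognition = new Recognition();
  recognition.continuous = true;     // keep listening across pauses
  recognition.interimResults = true; // deliver partial hypotheses for low latency
  recognition.lang = "en-US";        // assumed language for this sketch
  recognition.onresult = (event: RecognitionEvent) => onCaption(captionFromEvent(event));
  recognition.start();
}
```

In a setup like the one described, a function such as `startAutoCaptions` would run on the presenter-mode device, and the callback would be the point at which caption text is relayed to the glasses. How AoD actually transports that text is not covered in this summary.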

Key findings

The most significant benefit reported was 'glanceability' — the ability to view captions or an interpreter in the smart glasses while maintaining eye contact with the speaker, eliminating the need to constantly shift gaze between the speaker and a separate captioning screen. Mathew noted that conversation partners found interactions more natural when he could look directly at them. However, both researchers identified substantial hardware and usability limitations. The Vuzix Blade had only about one hour of battery life under AoD's streaming demands, generated noticeable heat, and required Wi-Fi connectivity that was not always reliable. The requirement to hold a phone in presenter mode meant one-handed signing for ASL users, effectively preventing two-way communication for Mak. Caption positioning was fixed in the center of the field of view, blocking the speaker's face, with no option to reposition horizontally. Auto-captioning failed when speakers talked too fast or were more than six feet from the microphone, and lacked punctuation, line breaks, and speaker identification — making group conversations particularly difficult. The bulky form factor drew unwanted attention in public, with bystanders concerned about being recorded. For users wearing cochlear implants or hearing aids, the glasses' thick frames caused physical interference, requiring constant readjustment.

Relevance

This paper provides valuable first-person DHH user perspectives on AR-based accessibility — a technology category with significant potential but clear current limitations. For accessibility practitioners, the key takeaway is that the concept of on-demand, wearable captioning and interpretation is sound and addresses a real gap: many everyday situations (social events, informal conversations, public venues) lack accessibility accommodations. The glanceability benefit alone represents a meaningful improvement over phone- or laptop-based captioning. However, the findings also serve as a reality check on current AR hardware limitations — battery life, form factor, connectivity, and display positioning all need substantial improvement before mainstream adoption is viable. The paper's emphasis on supporting multiple communication modalities (ASL interpretation, human captioning, and auto-captioning) reflects the diversity of DHH users' preferences and needs, reinforcing that accessibility solutions must be flexible rather than one-size-fits-all. As AR hardware continues to evolve with lighter, longer-lasting devices, the AoD platform model of on-demand, multi-modal access services could become a practical everyday tool for the DHH community.

Tags: augmented reality · deaf and hard of hearing · smart glasses · captioning · sign language interpretation · assistive technology · wearable technology · real-time captioning · automatic speech recognition