Deaf and Hard of Hearing Access to Intelligent Personal Assistants: Comparison of Voice-Based Options with an LLM-Powered Touch Interface

Paige S DeVries, Michaela Okosi, Ming Li, Nora Dunphy, Gidey Gezae, Dante Conway, Abraham Glasser, Raja Kushalnagar, Christian Vogler · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26) · doi:10.1145/3772318.3791869

Summary

This mixed-methods study compares three input methods for Deaf and Hard of Hearing (DHH) people who use their voice to interact with an Amazon Echo Show: (1) natural deaf-accented speech via Alexa's built-in ASR, (2) Wizard-of-Oz 'facilitated English' where a trained human listener re-speaks commands to Alexa in real time, and (3) an LLM-assisted touch interface on a Fire HD tablet that pre-builds Alexa commands through a 3x3 button grid populated by ChatGPT-4o from the user's history and smart-home context. Twenty DHH participants (10 Deaf, 2 Deaf-Blind, 8 Hard of Hearing) completed 10 predefined smart-home tasks per condition in a simulated apartment with two Philips Hue smart lights, then filled out the System Usability Scale (in English or ASL-SUS), Adjective Scale, and Net Promoter Score for each condition, and participated in semi-structured interviews analysed with Braun and Clarke thematic coding by two ASL-English bilingual coders. Word Error Rate was computed for the deaf-speech condition using SCLite against Alexa's recognition logs. The study deliberately targets a population often excluded from IPA accessibility research - non-signing (or speech-using) DHH people - and positions itself against prior ASL-input work by Tran et al. and DeVries et al., using comparable metrics to enable direct cross-study comparison.
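The Word Error Rate metric mentioned above can be illustrated with a short sketch. The study used SCLite; the code below is not SCLite and does not reproduce its normalisation rules. It only shows the standard definition, WER = (substitutions + deletions + insertions) / number of reference words, computed as a word-level edit distance. The function name and the example commands are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: lowercases and splits on whitespace, which is
    simpler than the text normalisation a tool such as SCLite applies.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical command: one substituted word out of five.
print(word_error_rate("alexa turn on the light", "alexa turn off the light"))
# Nothing recognised at all: every reference word counts as a deletion.
print(word_error_rate("alexa turn on the light", ""))
```

Under this definition an utterance that is not recognised at all scores 1.0, that is, 100% WER.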

Key findings

SUS did not differ significantly across the three conditions: LLM Touch scored highest at 63.5 (SD 20.8), followed by Facilitated English at 62.5 (SD 22.6) and Natural Deaf Speech at 59.6 (SD 15.9). Half of participants had zero recognition errors with natural deaf speech, and another quarter had WER under 10% - a surprisingly strong result given the general state of deaf-accented ASR, though two participants had 100% WER and could not get past the wake word. Observed NPS was much higher than SUS-predicted NPS for Natural Deaf Speech (-5 observed vs -30 predicted), suggesting participants were enthusiastic about voice working at all, even when usability was marginal. Facilitated English showed the widest SUS spread: some participants loved the accuracy, while others were frustrated by the 5-second re-speaking latency and by the opacity of the human-in-the-loop. LLM Touch was praised for reliability but criticised for its 8-10 second round-trip latency, limited grid options, and unpredictable LLM-generated choices. 45% called touch the easiest method, while 35% chose natural deaf speech; even so, over half rated touch as a poor alternative to speech overall. Participants repeatedly raised hands-free needs (cooking, driving, emergencies) and expressed strong interest in native ASL recognition and in eye-tracking or gesture input.
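For readers unfamiliar with how the SUS and NPS figures above are derived, the sketch below applies the standard published scoring rules for both instruments. It is not the authors' analysis code, the function names are hypothetical, and it does not cover the SUS-predicted NPS, whose mapping is not given in this summary.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Standard SUS scoring for ten items answered on a 1-5 scale.

    Odd-numbered items contribute (response - 1), even-numbered items
    contribute (5 - response); the sum is scaled by 2.5 to give 0-100.
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs exactly ten responses between 1 and 5")
    total = 0
    for index, response in enumerate(responses):
        is_odd_item = index % 2 == 0  # items 1, 3, 5, ... sit at even indices
        total += (response - 1) if is_odd_item else (5 - response)
    return total * 2.5


def net_promoter_score(ratings: Sequence[int]) -> float:
    """Standard NPS: percent promoters (9-10) minus percent detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# A neutral respondent (all 3s) lands at the midpoint of the SUS scale.
print(sus_score([3] * 10))
# Made-up ratings: two promoters, two passives, two detractors.
print(net_promoter_score([10, 9, 8, 7, 6, 0]))
```

Assuming all 20 participants answered the NPS question, each net promoter or detractor moves the score by 5 points, so an observed NPS of -5 corresponds to one more detractor than promoter.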

Relevance

For practitioners building voice or multimodal assistants, the central takeaway is that there is no shortcut around making ASR understand deaf-accented and dysarthric speech natively: neither human-in-the-loop re-speaking (too slow and socially awkward) nor LLM-assisted touch (too slow and too constrained) is an adequate substitute for voice input, even though both are technically feasible today. The LLM touch architecture (action verb -> context-aware option grid -> follow-up prompt -> final command) is a reusable pattern for command-and-control surfaces where full free-form typing is too slow. The reported latencies (2s natural speech, 5s facilitated, 8-10s LLM touch) give engineering teams concrete baselines to beat. Limitations to keep in mind: the Wizard-of-Oz facilitator represents a best-case re-speaker that exceeds commercially available tools, the sample was drawn from Gallaudet/DC and is not representative of all DHH speech patterns, and every participant used some ASL in daily life, so findings may differ for non-signing DHH users elsewhere. The open-source LLM Touch code on GitHub lowers the barrier for follow-up studies and product integration.
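The four-step touch flow described above (action verb -> context-aware option grid -> follow-up prompt -> final command) can be sketched as follows. This is not the study's open-source implementation: every name here is hypothetical, the LLM call is replaced by a local stub that ranks two made-up light names by how often they appear in the user's history, and the command wording is an assumption.

```python
from typing import Callable, List, Optional

GRID_ROWS = 3
GRID_COLS = 3
GRID_SIZE = GRID_ROWS * GRID_COLS  # the summary describes a 3x3 button grid

# Hypothetical device names standing in for the smart-home context.
HYPOTHETICAL_DEVICES = ["the living room light", "the bedroom light"]

# An option provider takes (verb, history) and returns ranked option labels.
# In a real system this is where the LLM request would go.
OptionProvider = Callable[[str, List[str]], List[str]]


def stub_option_provider(verb: str, history: List[str]) -> List[str]:
    """Stand-in for the LLM: rank devices by how often history mentions them."""
    def mentions(device: str) -> int:
        return sum(device in past_command for past_command in history)
    return sorted(HYPOTHETICAL_DEVICES, key=mentions, reverse=True)


def build_grid(verb: str, history: List[str],
               provider: OptionProvider) -> List[List[str]]:
    """Steps 1-2: turn the chosen verb into a 3x3 grid of option labels."""
    options = provider(verb, history)[:GRID_SIZE]
    options += [""] * (GRID_SIZE - len(options))  # empty buttons pad the grid
    return [options[row * GRID_COLS:(row + 1) * GRID_COLS]
            for row in range(GRID_ROWS)]


def compose_command(verb: str, target: str,
                    follow_up: Optional[str] = None) -> str:
    """Steps 3-4: merge verb, chosen option, and optional follow-up answer."""
    parts = [verb, target]
    if follow_up:
        parts.append(follow_up)
    return "Alexa, " + " ".join(parts)


history = ["turn on the bedroom light"]
grid = build_grid("turn on", history, stub_option_provider)
print(grid[0])
print(compose_command("set", grid[0][0], "to blue"))
```

Keeping the provider behind a plain function signature means the slow step, the LLM request, can be cached or swapped out without touching the grid or command-composition code, which is one way to attack the round-trip latency participants criticised.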

Tags: deaf and hard of hearing · voice assistant · intelligent personal assistant · automatic speech recognition · deaf-accented speech · large language model · accessibility · smart home