Artificial Intelligence Fairness in the Context of Accessibility Research on Intelligent Systems for People Who Are Deaf or Hard of Hearing

Sushant Kafle, Abraham Glasser, Sedeeq Al-khazraji, Larwan Berke, Matthew Seita, Matt Huenerfauth · 2020 · SIGACCESS Accessibility and Computing · doi:10.1145/3386296.3386300

Summary

This paper from RIT's Center for Accessibility and Inclusion Research discusses AI fairness issues specifically through the lens of the authors' extensive research on intelligent systems for people who are Deaf or Hard of Hearing (DHH). The authors identify five interconnected challenges. First, the need for inclusion of DHH data in training sets: automatic speech recognition (ASR) systems perform poorly on DHH users' speech because DHH voices are absent from training data, and emotion detection systems misidentify anger when analysing faces of sign language users because facial expressions serve grammatical functions in American Sign Language (ASL). Second, the lack of interpretability of AI systems: the black-box nature of deep learning makes it harder for marginalized users to provide input and oversight on deployment decisions, and there is a danger that cost-conscious decision-makers will prematurely deploy imperfect AI captioning to replace human interpreters and captionists, reducing service quality for DHH people. The authors recount how the World Federation of the Deaf and World Association of Sign Language Interpreters issued a joint statement in 2018 cautioning against premature deployment of sign language avatar technology, illustrating the tension between research potential and deployment readiness. Third, researchers have ethical responsibilities to ensure the capabilities of their AI systems are communicated honestly, particularly given media tendencies to exaggerate AI performance.

Key findings

Fourth, the paper argues that standard evaluation metrics for AI systems may not align with the needs of disabled users. In their captioning research, the authors found that Word Error Rate (WER) — the standard ASR metric — correlated poorly with DHH users' actual comprehension and satisfaction with captions. They developed an alternative metric that better predicted DHH user opinions and advocated for its adoption in the ASR community. This illustrates a broader problem: when AI research optimizes for standard metrics, it may optimize toward results that do not serve disabled users' real needs. Fifth, the authors identify a novel concern about human-AI interaction: AI systems change human behaviour. When ASR-based automatic captioning was deployed during DHH-hearing conversations, hearing speakers changed their speech patterns — speaking louder, faster, and with non-standard articulation — potentially degrading ASR performance in the very context it was designed for. More broadly, the paper argues that AI creates new "ability requirements": to use a voice assistant you must produce recognizable speech, to be detected by autonomous vehicles you must look like a typical pedestrian, and to pass AI interview screening you must produce expected facial expressions and vocal patterns. These requirements create new social barriers for disabled people.
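For readers unfamiliar with the metric, the following is a minimal sketch of how Word Error Rate is conventionally computed: the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The function name, the lowercase-and-split tokenization, and the example sentences are assumptions made for illustration; they are not taken from the paper, and the authors' alternative metric is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Tokenization (lowercase, split on whitespace) is a simplifying assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical captions: each contains exactly one substituted word.
reference = "the meeting is not cancelled today"
caption_a = "a meeting is not cancelled today"     # minor article error
caption_b = "the meeting is now cancelled today"   # error reverses the meaning

print(f"{word_error_rate(reference, caption_a):.3f}")  # 0.167
print(f"{word_error_rate(reference, caption_b):.3f}")  # 0.167
```

Both hypothetical captions receive the same score even though only the second changes what the sentence means. This is one plausible way a metric that weights every word equally could diverge from readers' judgments of caption quality; it is offered as an illustration, not as the authors' own analysis.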

Relevance

This paper is particularly valuable because it draws on years of concrete research experience building AI systems for DHH users rather than theorising about potential harms. The finding that standard evaluation metrics misalign with disabled users' needs is actionable for any team developing AI accessibility tools — it calls for developing and validating disability-specific metrics before deploying systems. The behavioural change observation (hearing people altering speech when captioning is present) reveals an underexplored feedback loop in human-AI interaction that has implications for any accessibility technology that mediates communication between disabled and non-disabled people. The concept that AI systems create new "ability requirements" provides a powerful framing: each new AI-mediated interaction defines what a body must be able to do to participate, systematically excluding people whose bodies, speech, or behaviours fall outside narrow algorithmic expectations. For the DHH community specifically, the tension between imperfect-but-available AI captioning and high-quality-but-scarce human interpretation remains unresolved and has significant implications for education, healthcare, and employment access.

Tags: AI fairness · deaf and hard of hearing · automatic speech recognition · captioning · evaluation metrics · sign language avatars · ethical responsibility · human-AI interaction · behavioural change

Standards referenced: ACM Code of Ethics