I See What You're Saying: A Literature Review of Eye Tracking Research in Communication of Deaf or Hard of Hearing Users

Chanchal Agrawal, Roshan L Peiris · 2021 · Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '21) · doi:10.1145/3441852.3471209

Summary

This paper presents a comprehensive literature review of 55 eye-tracking studies examining the gaze patterns and communication behaviours of deaf or hard-of-hearing (DHH) individuals. The authors systematically searched ACM, IEEE, Google Scholar, and linguistics and psychology databases to identify research on how DHH people use vision for communication. The review is organised around two primary themes: Communication Mode and Context. Communication Mode is subdivided into Visual Communication (sign language, facial expression, animation, and lip-reading; 32 papers) and Text-based Communication (captions and subtitles; 11 papers). Context covers Classroom environments, Attention Management strategies, and Prototypes designed for DHH users (12 papers). The paper reports that DHH individuals have heightened visual cognition and employ distinct gaze strategies compared to hearing individuals. It attributes this enhanced visual capacity to cross-modal cortical reorganisation, in which auditory brain regions are repurposed for visual processing, resulting in improved motion detection, peripheral vision, and faster reaction times to peripheral stimuli. The review traces eye-tracking research in this domain from the late 1990s to 2020, noting a significant increase in publications from 2011 onward. The most commonly used eye-tracking devices were those from SMI (16 studies), Tobii (12 studies), and EyeLink (6 studies). The authors use the survey to synthesise findings across decades of research and to identify gaps and future research directions.

Key findings

The review reveals several consistent patterns in DHH gaze behaviour. During sign language communication, DHH individuals primarily fixate on the face, particularly the eyes, while perceiving hand movements, gestures, and finger-spelling in peripheral vision. Beginner signers focus on the mouth region to supplement signing with lip-reading, whereas native signers focus on the eye region to access grammatical and referential information. Non-manual markers such as eye blinks, head nods, and brow movements serve critical linguistic functions in sign language: they mark phrase boundaries, convey emphasis, and regulate turn-taking. For text-based communication, skilled deaf readers demonstrate a larger perceptual span (up to 18 letter spaces) than hearing readers (up to 14); they also skip more words, re-fixate fewer words, and regress less often. This suggests strong orthographic-semantic processing that compensates for limited phonological access. Captions and subtitles, while beneficial, compete with visual content for attention: DHH viewers spend more time on the visual content than on the text, while hearing viewers show the opposite pattern. In classroom settings, DHH students focus primarily on the interpreter rather than the instructor or slides, directing only 10% of their attention to the instructor and 14% to slides, whereas hearing students direct 74% of their attention to the instructor. This need to split attention across several visual sources significantly affects learning outcomes for DHH students.
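Attention percentages like the classroom figures above are commonly obtained by summing fixation durations per area of interest (AOI) and dividing by the total. The following is a minimal sketch of that calculation, not the method of any reviewed study: the function name `dwell_share`, the AOI labels, and the fixation durations are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def dwell_share(fixations):
    """Return each AOI's share of total fixation time as a fraction of 1.

    `fixations` is an iterable of (aoi_label, duration_ms) pairs.
    This input format is an assumption made for this sketch.
    """
    totals = defaultdict(float)
    for aoi, duration_ms in fixations:
        totals[aoi] += duration_ms

    grand_total = sum(totals.values())
    if grand_total == 0:
        # No fixation time recorded, so no shares can be computed.
        return {}
    return {aoi: total / grand_total for aoi, total in totals.items()}


# Hypothetical toy data, NOT taken from the reviewed studies.
toy_fixations = [
    ("interpreter", 400),
    ("slides", 100),
    ("interpreter", 300),
    ("instructor", 200),
]

shares = dwell_share(toy_fixations)
for aoi, share in sorted(shares.items()):
    print(f"{aoi}: {share:.0%}")
```

Individual studies may define attention differently, for example by fixation count rather than duration, so percentages are only comparable across papers when the underlying measure matches.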

Relevance

This review provides essential insights for anyone designing accessible communication technologies for DHH users. The finding that DHH individuals focus on the facial region during sign language has direct implications for video communication platforms: the face should be prioritised for resolution and temporal fidelity, while peripheral elements such as hands and body may tolerate stronger compression. For caption and subtitle design, the research demonstrates that placement, font size, colour contrast, and editing approach all significantly affect comprehension, with real-time captioning preferred over Automatic Speech Recognition in classroom settings. The classroom findings are particularly actionable: reference cues derived from eye-tracking data improved DHH students' slide fixation from 14% to 16%, a modest gain which suggests that attention-guiding technologies could improve educational accessibility. The authors propose several future directions, including video coding optimisation for sign language videos, accessible interface design for remote learning, and real-time eye-tracking to create interactive captions. For practitioners, this paper is a valuable reference for understanding the visual attention strategies that should inform the design of any DHH-facing communication tool.

Tags: deaf and hard of hearing · eye tracking · eye gaze · sign language · lip-reading · captions · attention management · visual communication · literature review