
Support in the Moment: Benefits and use of video-span selection and search for sign-language video comprehension among ASL learners

Saad Hassan, Akhter Al Amin, Caluã de Lacerda Pataca, Diego Navarro, Alexis Gordon, Sooyeon Lee, Matt Huenerfauth · 2022 · Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22) · doi:10.1145/3517428.3544883

Summary

This paper investigates technology to support American Sign Language (ASL) learners in comprehending challenging sign-language videos by enabling in-context dictionary lookup. The researchers conducted two studies at Rochester Institute of Technology. Study 1 interviewed 14 ASL learners (hearing students with a mean of 3.4 years of ASL study) about their challenges watching ASL videos and their current workarounds for unfamiliar vocabulary. Participants reported difficulty with dialectal variation (e.g., Black ASL); with linguistic phenomena such as fingerspelling, compound signs, coarticulation, and classifiers; and with video genres ranging from conversations to Deaf theatre. Their workarounds included pausing, backtracking, slowing video playback, using context clues, consulting English-to-ASL dictionaries, attempting ASL-to-English reverse dictionaries, searching Google for descriptions of signs, and asking peers or teachers. A key frustration was having to leave the video to use external dictionary tools, which cost them the context they needed for comprehension. Study 2 then compared a Wizard-of-Oz prototype that integrated a video player with span-based dictionary search against a baseline condition using an existing feature-based reverse dictionary website (HandSpeak). Fifteen ASL learners (8 with integrated search, 7 with the baseline) each translated 9 videos spanning three genres: natural conversations, educational content, and Deaf theatre/poetry performances.

Key findings

The integrated dictionary-search prototype produced significantly higher translation accuracy scores than the baseline (8.03 vs. 6.67 out of 10, Mann-Whitney U=10, p=0.0424). Participants using the integrated tool also reported significantly lower mental demand, temporal demand, and frustration on NASA-TLX workload scales. The in-depth observational analysis revealed several novel interaction patterns. Six participants repurposed the span-selection tool to constrain the video playhead, progressively selecting short spans to work through videos incrementally, an unintended but valuable use. Participants in the integrated condition used dictionary search in 62 of their 72 video sessions (8 participants × 9 videos), and in 40 of those 62 sessions they searched after already completing a full translation, to confirm their understanding. When narrowing searches, participants progressively reduced span width from an average of 8.17 seconds down to 2.33 seconds immediately before searching. Video genre significantly affected span-selection behavior: theatre/poetry videos required wider spans because of figurative language, depiction, and classifiers, while conversational videos used shorter spans. Participants struggled when the citation form of a sign in the dictionary results differed from its appearance in continuous signing, particularly with compound signs and depiction. Participants suggested improvements including showing dialectal variations and contextual examples for each dictionary result.

Relevance

This research addresses a practical gap in sign language education technology that affects both Deaf and hearing communities. With nearly 200,000 students enrolled in ASL classes in the U.S. and ASL being one of the fastest-growing language courses, tools that support video comprehension have broad applicability. The finding that integrated, in-context dictionary search outperforms external tools has direct implications for designers of ASL learning platforms and video players. The observation that students repurposed span selection for playhead control suggests that video comprehension tools should support both dictionary search and controlled segment replay as first-class features. For computer vision researchers working on sign recognition, the study reveals a new task (matching signs extracted from sub-segments of continuous video rather than isolated citation-form signs) and highlights challenges, including coarticulation effects and dialectal variation, that recognition systems must handle. The genre-dependent differences in user behavior also suggest that adaptive educational software could detect when students are struggling and offer appropriate support.
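
To make the "controlled segment replay" idea concrete, here is a minimal sketch of a span that constrains the playhead. It is not the paper's prototype (which was a Wizard-of-Oz system); the names Span, narrowed, and next_playhead are hypothetical, and the choice to loop back to the span start when the playhead leaves the span is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """A selected region of the video timeline, in seconds (hypothetical model)."""
    start: float
    end: float

    def __post_init__(self) -> None:
        if self.start < 0 or self.end <= self.start:
            raise ValueError("span must satisfy 0 <= start < end")

    @property
    def width(self) -> float:
        return self.end - self.start

    def narrowed(self, start: float, end: float) -> "Span":
        """Return a sub-span, clipped so it never extends outside this span."""
        return Span(max(self.start, start), min(self.end, end))


def next_playhead(position: float, dt: float,
                  span: Optional[Span], duration: float) -> float:
    """Advance the playhead by dt seconds.

    Assumption: with a span selected, playback loops back to the span start
    whenever it would leave the span; with no span, it stops at the video end.
    """
    new_pos = position + dt
    if span is None:
        return min(new_pos, duration)
    if new_pos < span.start or new_pos >= span.end:
        return span.start
    return new_pos
```

Under these assumptions, a learner who progressively narrows a selection would call narrowed repeatedly, and the same Span object could then be handed to whatever dictionary-search step the tool provides.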

Tags: sign language · American Sign Language · ASL · video comprehension · dictionary search · sign language learning · Wizard-of-Oz · deaf and hard of hearing · educational technology