Deaf and hard-of-hearing users’ prioritization of genres of online video content requiring accurate captions

Larwan Berke, Matthew Seita, Matt Huenerfauth · 2020 · Proceedings of the 17th International Web for All Conference (W4A) · doi:10.1145/3371300.3383337

Summary

This paper investigates which genres of online video content Deaf and Hard-of-Hearing (DHH) users consider most important to have accurately captioned. With over 400 hours of video uploaded to YouTube every minute and no U.S. legal mandate to caption all online video (especially user-generated content), the researchers argue that prioritization is essential for directing captioning resources where they matter most to the DHH community. The study used Best-Worst Scaling (BWS), a survey methodology in which participants are shown small subsets of items and asked to select only the most and least important item from each subset, rather than ranking all items at once. This was the first known use of BWS with DHH participants. The research proceeded in two phases. The first was an in-person validation study at Rochester Institute of Technology with 25 DHH participants (5 Deaf, 10 deaf, 10 hard of hearing; mean age 22.1), who completed BWS questions, card-sorting, and three-level grouping tasks using 16 YouTube genre categories. The BWS results were statistically equivalent to the card-sorting ranks for 15 of 16 genres, validating the method. The second was a large online survey with 151 DHH participants (80 Deaf, 48 deaf, 23 hard of hearing; ages 18 to 88, mean 39.9) from 27 U.S. states, which used the validated BWS instrument with instructions in both English and ASL. The survey design included a responsive cross-platform UI, instructional videos in ASL with English captions and transcripts, and an instructional manipulation check.
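To make the BWS procedure concrete, here is a minimal sketch of the standard count-based BWS score: the number of times an item was picked as "best", minus the number of times it was picked as "worst", divided by the number of times it was shown. This review does not state which scoring variant the authors used, so treat the formula as an assumption. The function name `bws_scores`, the response format, and the toy responses are hypothetical illustrations, not data from the study.

```python
from collections import Counter


def bws_scores(responses):
    """Count-based BWS score per item: (best - worst) / times shown.

    `responses` is an iterable of (shown_items, best, worst) tuples.
    This layout is an assumption made for illustration.
    """
    best, worst, shown = Counter(), Counter(), Counter()
    for items, b, w in responses:
        if b not in items or w not in items or b == w:
            raise ValueError("best and worst must be distinct items from the subset")
        shown.update(items)  # every item in the subset counts as one appearance
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}


# Hypothetical toy responses (not the study's data).
responses = [
    (("News", "Games", "Music", "Education"), "News", "Games"),
    (("News", "Sports", "Music", "Education"), "Education", "Sports"),
    (("News", "Games", "Sports", "Music"), "News", "Games"),
]
scores = bws_scores(responses)
for genre, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{genre:10s} {score:+.3f}")
```

Scores computed this way range from -1 (always chosen as least important) to +1 (always chosen as most important). An item that is never picked either way, or picked equally often as best and worst, scores 0.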

Key findings

The large-scale BWS survey produced a clear genre prioritization. The most important genres for accurate captioning were News and Politics (BWS score 0.679, with 396 "best" votes vs. only 11 "worst"), Education (0.641), and Technology and Science (0.436). Film and Animation, Entertainment, and NonProfits fell in the middle tier. The least important genres were Games (-0.510), Animals and Pets (-0.476), Sports (-0.374), and Music (-0.362). Qualitative analysis of 256 open-ended comments from the in-person study suggested why. For high-priority genres, participants cited wanting to stay informed about the world ("A lot of hearing people know whats happening everyday but its not fair that deaf people are not as aware"), needing captions for educational content and career development, and wanting to apply information to daily life. For low-priority genres, the dominant themes were that the information was already visually available (40 comments) or that the content was primarily visual in nature (31 comments). Reported impacts of inaccurate captions included vocabulary mix-ups causing real-world consequences (19 comments), comprehension difficulties from word errors (8 comments), and timing issues (7 comments). The Split-Half Reliability score of 0.972 indicated that the genre scores were highly consistent across respondents.
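As a rough illustration of a split-half reliability figure, the sketch below splits participants at random into two halves, computes count-based BWS scores in each half, and correlates the two score vectors, averaging over many random splits. The review does not describe the authors' exact procedure, so the choice of Pearson correlation, the number of trials, the function names, and the toy data are all assumptions.

```python
import random
from collections import Counter


def counting_scores(responses, items):
    """Count-based BWS scores, (best - worst) / shown, in the order of `items`."""
    best, worst, shown = Counter(), Counter(), Counter()
    for subset, b, w in responses:
        shown.update(subset)
        best[b] += 1
        worst[w] += 1
    return [(best[i] - worst[i]) / shown[i] if shown[i] else 0.0 for i in items]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def split_half_reliability(by_participant, items, trials=100, seed=0):
    """Mean correlation between BWS scores from random halves of participants."""
    rng = random.Random(seed)
    ids = list(by_participant)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(ids)
        half = len(ids) // 2
        first = [r for p in ids[:half] for r in by_participant[p]]
        second = [r for p in ids[half:] for r in by_participant[p]]
        total += pearson(counting_scores(first, items), counting_scores(second, items))
    return total / trials


# Hypothetical toy data: six participants who all answer identically.
items = ["A", "B", "C", "D"]
answers = [(("A", "B", "C", "D"), "A", "D"), (("A", "B", "C"), "A", "C")]
by_participant = {p: list(answers) for p in range(6)}
r = split_half_reliability(by_participant, items)
```

Because every toy participant answers the same way, both halves produce the same scores and the correlation is 1. Real data with disagreement between respondents would give a lower value, so a score near 1 means the ranking does not depend much on which respondents happened to be sampled.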

Relevance

This study provides the first empirically validated, community-driven prioritization of online video genres for captioning accuracy, directly useful for video platforms, content creators, and ASR researchers. The finding that News, Education, and Science/Technology are the highest priority, while visually rich genres like Sports and Games rank lowest, makes intuitive sense but had never been quantified with this rigor. For organizations allocating captioning resources, the prioritization offers evidence-based guidance. For ASR researchers, the genre rankings can inform which types of video content to include in training and evaluation datasets. The methodological contribution is equally valuable: the validated BWS approach with DHH-accessible survey design (ASL videos, responsive UI, English transcripts alongside captions) provides a replicable template for large-scale research with the DHH community. The qualitative findings also underscore that captioning quality directly affects comprehension and real-world outcomes: vocabulary errors in cooking or how-to videos, for example, can lead to actual mistakes.

Tags: deaf and hard of hearing · captioning · video accessibility · automatic speech recognition · user research · Best-Worst Scaling