Beyond Subtitles: Captioning and Visualizing Non-speech Sounds to Improve Accessibility of User-Generated Videos
Oliver Alonzo, Hijung Valentina Shin, Dingzeyu Li · 2022 · Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22)
This paper investigates a significant gap in video captioning: the representation of non-speech sounds. While automatic speech recognition (ASR) has become widely available for generating captions on platforms like YouTube, TikTok, and Zoom, these systems focus exclusively on…
deaf and hard of hearing · captioning · non-speech sounds · automatic captions · video accessibility