Glossary

Terms used in accessibility research and practice. Each entry has a definition, common aliases, and category tags.

Search results

Scene Change Detection (also: Shot Boundary Detection, Scene Transition Detection): An automated technique for identifying transitions between different scenes or shots in video content by analyzing visual differences between consecutive frames. In audio description workflows, scene change detection helps determine optimal moments for inserting descriptions, as…
Scene Segmentation (also: Scene Detection, Shot Boundary Detection): Scene segmentation is the process of automatically dividing a video into discrete scenes or segments based on visual changes such as cuts, transitions, or the appearance of new elements in the frame. In the context of accessibility, scene segmentation is a foundational component…
Signer (also: Sign Language User, Signing Person): A person who communicates using sign language. In accessibility contexts, signers may be deaf, hard of hearing, or hearing individuals (such as interpreters, children of deaf adults, or others who have learned sign language). When creating accessible video content, signers…
Signer Box (also: Signing Space, Sign Space): The three-dimensional space in front of a sign language user within which signs are produced, typically extending from the waist to just above the head and about an arm's width to either side. The signer box is a critical concept in sign language video production, video…
Silent Gap Detection (also: Silence Detection, Audio Gap Detection): An automated technique for identifying periods of silence or absence of speech in audio tracks, used in audio description workflows to find natural insertion points for descriptions. Silent gap detection distinguishes between complete silence (no sound at all) and non-speech…
Social Media Video Captions (also: SMVC): An umbrella term for the textual or symbolic elements (platform-generated captions, creator-edited captions, user-generated captions, and non-speech information such as sound effects, music cues, or onomatopoeia) that are temporally aligned with video content on social media…
Sound Design (also: Audio Design): The craft of creating, selecting, and arranging audio elements (dialogue, music, ambient sound, foley, and effects) to shape the experience of a film, game, broadcast, or interactive product. For accessibility, sound design is doubly important: it carries narrative and…
Spatiotemporal Saliency (also: Spatiotemporal Saliency Estimation, Spatio-Temporal Saliency): A computer vision technique that estimates, for each pixel in a video, how visually important it is at a given moment by combining spatial contrast (features that stand out within a frame) with temporal contrast (regions that change or move differently from their recent…
Speech Gap (also: Dialogue Gap, Audio Gap): A pause or silence between passages of spoken dialogue in a video or film where audio descriptions can be inserted without overlapping the original dialogue. Identifying speech gaps is a critical first step in audio description production, as descriptions must fit within these…
Subtitle (also: Subtitles, Open captions (video), Movie subtitles): On-screen text that reproduces the spoken dialogue of a video, most commonly rendered in a "movie subtitle" style (white text with a black outline, one or two lines at the bottom of the frame). Subtitles are closely related to captions but are conventionally distinguished in…
Subtitles (also: Captions, Closed Captions, CC): Text displayed on screen that represents the spoken dialogue and other relevant audio information in video content. Subtitles (called captions in North America) are essential for deaf and hard of hearing viewers but are also widely used by hearing audiences in noisy…
Synthesized Video Description (also: TTS Video Description, Text-to-Speech Description, Synthesized Audio Description): An audio description for video content that is generated using text-to-speech (TTS) technology rather than recorded by a human narrator. A describer writes a text script describing the visual elements of a video, and speech synthesis software converts this text into spoken…
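
Two of the entries above (Scene Change Detection and Silent Gap Detection) describe automated techniques that can be illustrated with a minimal sketch. The code below is not taken from the glossary; it assumes frames arrive as flattened lists of grayscale pixel values and audio as a list of per-window loudness levels between 0 and 1. The function names, thresholds, and data layout are all hypothetical, and the silence check is a plain loudness threshold that does not separate speech from non-speech sound.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[int]  # assumed layout: flattened grayscale pixels, 0-255


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_scene_changes(frames: Sequence[Frame],
                         threshold: float = 30.0) -> List[int]:
    """Return each index i where frame i differs sharply from frame i - 1.

    The threshold is an illustrative value, not a recommended setting.
    """
    return [
        i for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


def detect_silent_gaps(levels: Sequence[float],
                       window_seconds: float,
                       silence_level: float = 0.02,
                       min_gap_seconds: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) times of quiet runs lasting at least min_gap_seconds.

    `levels` holds one loudness value per analysis window (assumed 0.0-1.0).
    """
    gaps: List[Tuple[float, float]] = []
    start = None  # index of the first window in the current quiet run
    # A trailing loud sentinel closes any quiet run that reaches the end.
    for i, level in enumerate(list(levels) + [float("inf")]):
        if level < silence_level:
            if start is None:
                start = i
        elif start is not None:
            if (i - start) * window_seconds >= min_gap_seconds:
                gaps.append((start * window_seconds, i * window_seconds))
            start = None
    return gaps


if __name__ == "__main__":
    frames = [[10] * 4, [12] * 4, [200] * 4, [198] * 4]
    print(detect_scene_changes(frames))
    levels = [0.5, 0.01, 0.01, 0.01, 0.4, 0.0, 0.3, 0.0, 0.0]
    print(detect_silent_gaps(levels, window_seconds=0.5))
```

In an audio description workflow, the two outputs would typically be combined: a quiet run that starts near a detected scene change is a candidate insertion point for a description.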

12 results.
