Glossary

Terms used in accessibility research and practice. Each entry has a definition, common aliases, and category tags.

Search results

Non-Speech Information (also: NSI, Non-Dialogue Audio Information): Any audio content in media that is not spoken dialogue, including environmental sounds, music, sound effects, and ambient noise. Non-speech information plays a critical role in storytelling by conveying mood, indicating off-screen events, and providing contextual cues. For…
Non-Speech Sounds (also: Non-Speech Audio, Sound Effects): Auditory content in media that is not spoken dialogue, including music, environmental noises, sound effects, laughter, applause, and other ambient sounds. Non-speech sounds carry important narrative, emotional, and contextual information that contributes to a viewer's…
Non-diegetic Sound (also: Non-diegetic Audio, Extradiegetic Sound): Sound in film, television, or games that does not originate from any source within the story world and cannot be heard by the characters (for example, orchestral score, voice-over narration, or added accessibility cues). This contrasts with diegetic sound, which exists in the…
Object-Based Audio (also: OBA, Object-Based Broadcasting): An audio production and delivery paradigm in which speech, music, effects, and ambience are transmitted as discrete objects with metadata describing their role and relationships, rather than as a single mixed stream. The receiver renders the final mix, enabling per-listener…
Onomatopoeia: Words that phonetically imitate or suggest the sound they describe, such as "buzz," "crash," "swoosh," or "sizzle." In captioning, onomatopoeia is one approach to representing non-speech sounds, offering viewers a sense of the acoustic quality of a sound. However, research shows…
Open Captioning (also: Open Captions, Burned-In Captions): Captions that are permanently embedded into the video image and cannot be turned off by the viewer. Unlike closed captions, open captions are part of the visual content itself, making them visible to all viewers regardless of device or platform support. Open captions are…
Open Captions (also: Burned-in Captions, Hard-coded Captions): Captions that are permanently embedded into a video and cannot be turned off by the viewer. Unlike closed captions, which can be toggled on or off, open captions are always visible as part of the video image itself. Open captions are sometimes used when a platform does not…
Panel Transition Cue (also: Scene Transition Cue, Panel Change Signal): An auditory signal used in audio-described comics and webtoons to indicate that the narrative has moved to a new panel, scene, or page. Panel transition cues help visually impaired listeners maintain orientation within the sequential narrative structure of comics, where visual…
Parallel Viewing (also: Dual-Screen Viewing): A media consumption strategy in which viewers use a second screen alongside the primary display to access supplementary information, accessibility features, or alternative content representations without interrupting the main viewing experience. For people with disabilities,…
Participatory Captioning: A framework proposed by Nguyen et al. (2026) that characterises social media video captioning as a collaborative, community-sustained infrastructure co-produced by viewers, creators, and platforms — rather than a top-down accessibility feature delivered unilaterally.…
Picture-in-Picture (also: PiP, PIP): A display technique that shows a smaller video or content window overlaid on the main content, allowing viewers to see two sources simultaneously. In accessibility contexts, picture-in-picture is the primary method for presenting sign language interpretation in video and…
Playhead (also: Play Head, Cursor Position): The visual indicator on a video or audio timeline that shows the current playback position, typically a vertical line or triangle marker that moves in time with the media. Playheads are a core primitive of timeline-based media tools (video editors, DAWs, subtitle authoring…
Podcast (also: Podcasting): An episodic, on-demand audio programme distributed over the internet, typically via RSS or proprietary platforms such as Spotify, Apple Podcasts, and BBC Sounds. Podcasts are a dominant form of long-form audio media (92% of UK adults listen to some audio content weekly) but…
Rapid Serial Visual Presentation (also: RSVP): A text display method in which words or short phrases are shown one at a time in a fixed location on screen in quick succession, eliminating the need for eye movements (saccades) between words. RSVP was first proposed in the 1950s for reading research and adapted for practical…
Read-Along (also: Read Along, Synchronised Highlighting, Karaoke-style Highlighting): An accessibility pattern in which on-screen text is highlighted word-by-word or phrase-by-phrase in synchronisation with spoken audio. Used in children's reading apps, language-learning tools, accessible ebook formats (e.g., EPUB Media Overlays), and podcast players.…
Respeaking (also: Speech-to-Speech Captioning, Voice Writing): A real-time captioning method in which a trained operator listens to speech and repeats it clearly into a speech recognition system optimized for their voice, producing captions. Respeaking is commonly used in broadcast television captioning and live events. It requires less…
Scene Description (also: SD, Visual Description): A textual description of the visual elements in a video scene (including objects, people, settings, actions, and visual cues) that can be converted into audio through text-to-speech technology. Scene descriptions serve as the basis for audio descriptions, making video content…
Scrubbing (also: Video Scrubbing, Timeline Scrubbing): The interaction of dragging a playhead across a video or audio timeline to preview content at arbitrary positions, typically with real-time visual or audio feedback. Scrubbing is ubiquitous in video editors, NLEs, DAWs, and subtitle-authoring tools. From an accessibility…
Sign Language Interpretation (also: Sign Language Interpreting, SLI): The process of conveying spoken or written language into a sign language (or vice versa) by a trained interpreter, enabling communication access for Deaf and hard of hearing individuals. In digital media and immersive environments, sign language interpretation is typically…
Signer Box (also: Signing Space, Sign Space): The three-dimensional space in front of a sign language user within which signs are produced, typically extending from the waist to just above the head and about an arm's width to either side. The signer box is a critical concept in sign language video production, video…
Signer Placement (also: Interpreter Placement): The spatial positioning of a sign language interpreter or signing instructor relative to instructional content in a video, videoconference, or immersive environment. Common arrangements include a side or corner window (typical in broadcast and videoconferencing), parallel…
Social Media Video Captions (also: SMVC): An umbrella term for the textual or symbolic elements (platform-generated captions, creator-edited captions, user-generated captions, and non-speech information such as sound effects, music cues, or onomatopoeia) that are temporally aligned with video content on social media…
Sonic Storytelling (also: Audio Storytelling, Sound-Based Narrative): The practice of conveying narrative, emotion, and information primarily through audio elements including narration, dialogue, sound effects, music, and spatial audio. In accessibility contexts, sonic storytelling is the approach used to make inherently visual media like comics,…
Sound Communication Technology (also: SCT): Technologies designed to communicate aspects of sound through non-auditory sensory modalities, enabling access to audio information for people who are d/Deaf or hard of hearing. Examples include closed captions (text-based), vibrating vests (haptic), spectrograms (visual…
Sound Design (also: Audio Design): The craft of creating, selecting, and arranging audio elements (dialogue, music, ambient sound, foley, and effects) to shape the experience of a film, game, broadcast, or interactive product. For accessibility, sound design is doubly important: it carries narrative and…
Sound Effect (also: SFX, Audio Effect): An artificially created or enhanced sound used to emphasize or accompany actions, events, or atmosphere in media. In accessible webtoon and comic production, sound effects are categorized into five types: environmental ambience (crowd cheering, classroom conversations),…
Sound Representation (also: Sound Depiction): The methods and conventions used to convey audio information through text in captions and other written formats. Common approaches include descriptive text (explaining the sound source and quality), onomatopoeia (words that mimic sounds), and sensory quality-focused descriptions…
Speaker Diarisation (also: Speaker Diarization, Speaker Segmentation): The automatic process of segmenting an audio recording by speaker identity (answering "who spoke when") and labelling each segment. A critical pre-requisite for accessible transcripts of multi-voice audio such as interviews, podcasts, and meetings, since a flat transcript…
Speaker Identification (also: Speaker ID, Speaker Attribution): Methods used in captions and subtitles to indicate which person is currently speaking, enabling viewers to follow conversations among multiple participants. Common in-text speaker identification techniques include double chevrons (>>) with speaker names, different text colors…
Speech Gap (also: Dialogue Gap, Audio Gap): A pause or silence between spoken dialogue in a video or film where audio descriptions can be inserted without overlapping with the original soundtrack. Identifying speech gaps is a critical first step in audio description production, as descriptions must fit within these…
Split-Attention Effect (also: Split Attention): A cognitive load phenomenon that occurs when learners or viewers must divide their visual attention between multiple sources of information that are physically or temporally separated. In captioned media, the split-attention effect occurs when viewers must read captions while…
Subtitle (also: Subtitles, Open captions (video), Movie subtitles): On-screen text that reproduces the spoken dialogue of a video, most commonly rendered in a "movie subtitle" style (white text with a black outline, one or two lines at the bottom of the frame). Subtitles are closely related to captions but are conventionally distinguished in…
Subtitles: Text displayed on screen that represents the spoken language in audio-visual content, primarily intended for viewers who do not understand the language being spoken. While often used interchangeably with captions, subtitles and captions serve different purposes: subtitles…
Subtitles (also: Captions, Closed Captions, CC): Text displayed on screen that represents the spoken dialogue and other relevant audio information in video content. Subtitles (called captions in North America) are essential for deaf and hard of hearing viewers but are also widely used by hearing audiences in noisy…
Talking-Head Video (also: Talking Head): A common educational video format in which a presenter speaks directly to the camera, typically filling the frame, with no or few accompanying visuals. For d/Deaf and Hard-of-Hearing learners, talking-head videos are often low in useful visual content: the speaker's face must…
Temporal Agency: The degree of control a viewer has over the timing and pace of media content consumption. In accessibility contexts, temporal agency refers to the ability to slow down, pause, rewind, or otherwise adjust the temporal flow of audiovisual content to accommodate individual…
Voice Acting (also: Voice Performance, Character Voicing): The performance art of providing voices for characters, narration, and other spoken content in media such as animation, audiobooks, games, and audio-described content. In accessible media production, voice acting significantly impacts emotional engagement and…
Webtoon Accessibility: The practice of making webtoons—vertically scrolling digital comics—accessible to people with disabilities, particularly blind and low vision users. Key challenges include converting rich visual narratives into audio form while preserving emotional engagement, pacing, and…
YouDescribe: A free web platform operated by the Smith-Kettlewell Eye Research Institute that enables volunteers to crowdsource audio descriptions for YouTube videos. Viewers can request that a video be described, and sighted volunteers record and align AD tracks synchronised with the original…
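
As an illustration of the Speech Gap entry, the following is a minimal Python sketch of locating pauses between dialogue segments. It assumes dialogue timings are already available as (start, end) pairs in seconds, for instance from a caption file. The function name `find_speech_gaps` and the `min_gap` threshold are hypothetical and are not taken from any tool named in this glossary.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def find_speech_gaps(
    dialogue: List[Segment],
    duration: float,
    min_gap: float = 1.0,  # assumed threshold, chosen only for illustration
) -> List[Segment]:
    """Return the silent intervals of at least `min_gap` seconds."""
    gaps: List[Segment] = []
    cursor = 0.0  # end time of the latest dialogue seen so far
    for start, end in sorted(dialogue):
        # A gap exists when the next line starts well after the last one ended.
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        # max() handles overlapping or nested dialogue segments.
        cursor = max(cursor, end)
    # Check the stretch between the last line and the end of the media.
    if duration - cursor >= min_gap:
        gaps.append((cursor, duration))
    return gaps


dialogue = [(0.5, 3.0), (2.5, 4.0), (6.0, 9.0)]
print(find_speech_gaps(dialogue, duration=12.0))
```

Each returned pair is a candidate window; whether a description actually fits still depends on its spoken length, which this sketch does not model.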
