Glossary

Terms used in accessibility research and practice. Each entry has a definition, common aliases, and category tags.

Search results

Spatialization(also: Spatialisation, Audio Spatialization, 3D Audio Spatialization): The process of rendering a sound so that it appears to originate from a specific location in three-dimensional space around the listener. Spatialization typically combines head-related transfer functions (HRTFs) to model how ears filter sound by direction, binaural or ambisonic…
Spatialized Audio(also: 3D Audio, Spatial Sound, Immersive Audio): Spatialized audio is a technology that creates the perception of sound coming from specific locations in three-dimensional space around the listener, using techniques such as head-related transfer functions (HRTFs) and binaural rendering. In accessibility, spatialized audio is…
Spatialized Sound(also: Spatial Audio, 3D Audio, Spatialized Audio): Audio that is rendered with positional information so that it appears to originate from a specific location in three-dimensional space around the listener. Spatialized sound uses techniques like head-related transfer functions (HRTFs), interaural time differences, and interaural…
Spatio-Temporal Modulation(also: STM): A rendering technique used in mid-air ultrasound haptics in which a single focal point is moved rapidly along a closed trajectory (commonly circular) while its intensity is modulated over time. When the trajectory is swept fast enough (typically tens to hundreds of Hz), the skin…
Spatiotemporal Saliency(also: Spatiotemporal Saliency Estimation, Spatio-Temporal Saliency): A computer vision technique that estimates, for each pixel in a video, how visually important it is at a given moment by combining spatial contrast (features that stand out within a frame) with temporal contrast (regions that change or move differently from their recent…
Speaker Adaptation(also: Voice Adaptation, Speaker-Adaptive Training, Voice Personalization): Speaker adaptation is the process of adjusting an existing automatic speech recognition (ASR) system — usually one trained on a large, demographically broad corpus of able-bodied speakers — to a particular individual's voice using a relatively small amount of that person's…
Speaker Diarisation(also: Speaker Diarization, Speaker Segmentation): The automatic process of segmenting an audio recording by speaker identity (answering "who spoke when") and labelling each segment. A critical prerequisite for accessible transcripts of multi-voice audio such as interviews, podcasts, and meetings, since a flat transcript…
Speaker Diarization(also: Speaker Segmentation): The process of partitioning an audio stream into segments according to speaker identity, determining "who spoke when" in a multi-speaker recording or conversation. Speaker diarization is important for accessibility because deaf and hard of hearing individuals need to distinguish…
Speaker Focus(also: Speaker Focus Mode, Speaker View): A video layout customization option that enlarges and centers the speaker while removing content overlays and auxiliary visual elements. Speaker Focus is designed for viewers who find pop-up graphics and overlays distracting and prefer to concentrate on the speaker body…
Speaker Identification(also: Speaker ID, Speaker Attribution): Methods used in captions and subtitles to indicate which person is currently speaking, enabling viewers to follow conversations among multiple participants. Common in-text speaker identification techniques include double chevrons (>>) with speaker names, different text colors…
Speaker Segmentation(also: Person Segmentation, Human Segmentation): The process of identifying and isolating the speaker or presenter in a video frame, separating them from the background and other visual elements. Speaker segmentation uses computer vision models to create precise masks around the speaker, enabling layout customization options…
Speaker-dependent speech recognition(also: User-adapted ASR, Personalized speech recognition): A speech recognition approach that trains or adapts its acoustic models to a specific individual's voice characteristics, rather than relying solely on general population models. For people with cognitive disabilities, dysarthria, or other speech differences, speaker-dependent…
Speaking Behavior(also: Speaker Behavior, Speech Behavior): In accessibility and HCI research, the observable communicative behaviors a speaker exhibits during conversation — including speech rate, voice intensity (loudness), articulation clarity (including hyperarticulation or over-enunciation), eye contact, gesturing, and pausing.…
Spearcon: A spearcon is a type of auditory cue created by time-compressing a spoken phrase until it becomes a very brief, distinctive sound. Unlike earcons, which use abstract musical sounds, spearcons retain a connection to the original speech, making them easier to learn and associate…
Spearman correlation(also: Spearman rank correlation, Spearman's rho): A non-parametric statistical measure of the strength and direction of the monotonic relationship between two ranked variables, ranging from -1 to +1. In accessibility evaluation research, Spearman correlation is used to assess how well automated metrics (such as Word Error Rate…
Special Category Data(also: Sensitive Personal Data): Under the GDPR, certain types of personal data that receive heightened protection due to their sensitive nature. Special categories include data revealing racial or ethnic origin, political opinions, religious beliefs, trade union membership, genetic data, biometric data, health…
Special Education(also: Special Needs Education, SPED): Educational programs, services, and instruction specifically designed to meet the unique needs of students with disabilities. Special education encompasses a range of settings from fully inclusive classrooms with support services to specialized separate schools. In India,…
Special Educational Needs(also: SEN, Special Needs Education, Special Education): An educational framework referring to children who experience difficulties in learning that require additional or different educational provision. SEN encompasses a broad range of conditions including cognitive disabilities, physical disabilities, sensory impairments, emotional…
Special Interest(also: Hyperfocus Interest, Intense Interest): A special interest is a deep, focused, and often long-lasting passion for a specific topic, activity, or subject area, commonly experienced by autistic individuals. Special interests go beyond typical hobbies in their intensity and depth of knowledge, and they can be a source of…
Special Interest Areas(also: SIAs, Circumscribed Interests, Intense Interests): Special interest areas (SIAs) refer to the intense, focused interests that are characteristic of many autistic individuals. While traditionally viewed through a deficit lens as "restricted" or "repetitive" behaviours, strengths-based approaches recognize SIAs as powerful…
Special Interests(also: Restricted Interests, Intense Interests): Special interests are focused, intense, and often enduring areas of passion (such as trains, dinosaurs, specific cartoon characters, or numerical systems) commonly observed in autistic children and adults. Once framed deficit-wise in diagnostic criteria as "restricted…
Specific Language Impairment(also: SLI, Developmental Language Disorder): A neurodevelopmental condition characterised by significant difficulties in acquiring and using language that cannot be attributed to hearing loss, intellectual disability, neurological damage, or environmental deprivation. Children with specific language impairment may have…
Specific Learning Disability(also: SLD, Learning Disability, Learning Disorder): A neurodevelopmental disorder that affects the brain's ability to receive, process, store, or respond to information, resulting in significant difficulties with reading, writing, or mathematics that are not attributable to intellectual disability, sensory impairment, or lack of…
Spectrogram(also: Sonogram, Spectral Display): A spectrogram is a visual representation of the frequency spectrum of a signal as it varies over time, typically showing time on the horizontal axis, frequency on the vertical axis, and intensity represented by color or brightness. In speech science and accessibility research,…
Speculative Design(also: Design Fiction, Critical Design): A design approach that uses conceptual proposals and provocative artifacts to explore possible futures, challenge assumptions, and stimulate debate rather than solve immediate practical problems. In accessibility research, speculative design is used to imagine alternative…
Speech Acts Theory(also: Speech Act Theory, Illocutionary Acts): A theory from the philosophy of language, originated by J.L. Austin and further developed by John Searle, which holds that utterances are not just statements of fact but also actions that accomplish things, such as requesting, promising, warning, or commanding. In assistive technology and…
Speech Composer(also: Speech Generation, Message Composition Engine): A software component in AAC (Augmentative and Alternative Communication) systems that takes user input — whether typed text, selected symbols, or telegraphic phrases — and processes it for spoken output through a text-to-speech synthesiser. Advanced speech composers may include…
Speech Delay(also: Language Delay, Delayed Speech): A condition in which a child does not develop speech and language skills at the expected rate for their age. Speech delay can affect the production of sounds (articulation), the ability to form words and sentences (expressive language), or the understanding of language…
Speech Dialogue Design(also: Speech Interface Design, Auditory Dialogue Design): The practice of designing the structure, content, ordering, and delivery of information presented through synthetic speech in computer interfaces. Effective speech dialogue design considers psycholinguistic principles such as the recency effect (items heard last are best…
Speech Disfluency(also: Disfluent Speech, Non-Fluent Speech): Any interruption to the normal flow of speech, including repetitions of sounds or words, prolongations of sounds, blocks (involuntary pauses), interjections, and revisions. While everyone experiences occasional disfluency, persistent speech disfluency conditions such as…
Speech Diversity(also: Diverse Speech, Non-Typical Speech): The full range of ways human speech varies from the narrow 'typical' speech on which most speech-AI systems are trained and benchmarked. Speech diversity includes people who stutter, d/Deaf and Hard-of-Hearing speakers, people with dysarthria, aphasia, or other neurological…
Speech Emotion Recognition(also: SER, Vocal Emotion Recognition): A class of machine-learning techniques that infers a speaker's emotional state from acoustic features of speech — pitch contour, intensity, rhythm, spectral properties, voice quality — usually producing a label (happy/sad/angry/calm) or continuous values on valence and arousal…
Speech Enhancement(also: Voice Enhancement, Speech Clarity Improvement): Audio processing techniques that improve the clarity, intelligibility, and quality of speech in audio or video content. Speech enhancement can involve removing background noise, extending audio bandwidth, normalizing volume levels, and improving articulation clarity. For viewers…
Speech Error(also: Articulation Error, Pronunciation Error): A deviation from the expected or standard production of speech sounds, including substitutions, omissions, additions, and distortions of phonemes. Speech errors are common among people who are deaf or hard of hearing, as limited auditory feedback makes it difficult to monitor…
Speech Gap(also: Dialogue Gap, Audio Gap): A pause or silence between spoken dialogue in a video or film where audio descriptions can be inserted without overlapping with the original soundtrack. Identifying speech gaps is a critical first step in audio description production, as descriptions must fit within these…
Speech Generating Device(also: SGD, Voice Output Communication Aid, VOCA): An electronic device used in augmentative and alternative communication that produces speech output, either through pre-recorded messages or text-to-speech synthesis. Speech generating devices range from dedicated hardware devices (like the Accent 1400) to software applications…
Speech Impairment(also: Speech Disability, Communication Disability): A condition affecting the ability to produce speech sounds or to communicate verbally. Speech impairments range from mild articulation difficulties to complete inability to speak, and may be caused by neurological conditions, physical injuries, developmental conditions, or…
Speech Input(also: Voice input, Voice control, Speech recognition input): An input method that allows users to control devices or enter text by speaking rather than using manual touch or keyboard input. Speech input is particularly important for people with visual impairments, who use it significantly more often than sighted users to overcome the…
Speech Intelligibility(also: Speech Recognition Score, Word Recognition): A measure of how well speech can be understood by a listener, typically expressed as the percentage of words or sentences correctly identified under specific listening conditions. Speech intelligibility is affected by factors including audio bandwidth, background noise, signal…
Speech Language Model(also: SLM, Audio Language Model, Speech Foundation Model): A class of large neural models that processes both speech and text in a single end-to-end framework, integrating tasks — automatic speech recognition, spoken language understanding, dialogue, speech generation — that traditionally required separate modular systems. Examples…
Speech Language Pathologist(also: SLP, Speech Therapist, Speech-Language Therapist): A licensed healthcare professional who specialises in the assessment, diagnosis, and treatment of communication disorders, including speech, language, voice, fluency, and swallowing difficulties. In accessibility and disability contexts, SLPs play a critical role in supporting…
Speech Neuroprosthesis(also: Speech BCI, Speech Brain-Computer Interface): A brain-computer interface that decodes neural activity associated with attempted or imagined speech and converts it into text, synthesized voice, or both. Speech neuroprostheses are designed for people with anarthria or severe dysarthria from ALS, brainstem stroke, locked-in…
Speech Output(also: Auditory Feedback, Spoken Feedback): Speech output refers to the use of synthesised or pre-recorded human voice to convey information from a computer system or device to a user. In accessibility contexts, speech output is a primary means of making visual interfaces accessible to blind and visually impaired users,…
Speech Prosodics(also: Prosodic Features, Suprasegmental Features): Speech prosodics refers to the nonverbal acoustic features of speech that convey meaning beyond the words themselves, including pitch (fundamental frequency), rhythm, stress, intonation patterns, pausing, and speaking rate. In accessibility research, prosodic analysis serves as…
Speech Rate(also: Speaking Rate, Articulation Rate): The speed at which speech is produced, typically measured in words per minute (WPM) or syllables per second. Typical conversational speech ranges from about 120 to 180 WPM, while screen reader users often configure synthetic speech at rates of 300 to 400 WPM or higher. Speech rate settings…
Speech Reading(also: Lip Reading, Lipreading, Visual Speech Perception): The practice of understanding speech by visually interpreting a speaker's lip movements, facial expressions, gestures, and body language. Speech reading is used by many Deaf and Hard-of-Hearing individuals as a communication strategy, often in combination with residual hearing…
Speech Recognition(also: Voice Recognition, STT, Speech-to-Text): Technology that converts spoken language into text or commands by analyzing audio input. Speech recognition powers dictation systems, voice assistants, and voice-controlled interfaces. For accessibility, speech recognition enables text input and device control for users who…
Speech Repair(also: Self-Correction, Speech Self-Repair, Command Correction): Speech repair is the process of correcting or modifying a spoken utterance after it has been produced, either within the same turn or in a subsequent one. In natural conversation, speakers commonly interrupt themselves to fix errors, change wording, or update information using…
Speech Rule Engine(also: SRE): An open-source JavaScript library that generates speech and Braille output for mathematical expressions given in presentation MathML. The Speech Rule Engine performs semantic interpretation of mathematical formulas — analyzing symbols, determining operator scope, and building…
Speech Sound Disorder(also: SSD, Speech Disorder, Articulation Disorder): A communication disorder affecting the development of accurate speech sound and prosody production in childhood. Children with SSDs struggle with phonological representation, phonological awareness, and print awareness, which can lead to difficulties learning to read and impact…
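
Worked example for the Spearman correlation entry: the definition describes a rank-based measure of monotonic association between -1 and +1. A minimal sketch of one common way to compute it follows: replace each value by its rank (ties share the average rank), then take the Pearson correlation of the ranks. The function names `average_ranks` and `spearman_rho`, and the sample numbers, are illustrative assumptions and do not come from the glossary.

```python
from statistics import mean


def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: an automated error score per clip and a human rating per clip.
error_scores = [0.05, 0.12, 0.20, 0.33, 0.41]
human_ratings = [4.8, 4.1, 3.9, 2.5, 2.0]
rho = spearman_rho(error_scores, human_ratings)
print(rho)
```

In this made-up sample the ratings fall every time the error score rises, so the ranks are exactly reversed and the coefficient sits at the negative end of the range. Real evaluation data would normally give a value strictly between -1 and +1.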
