Glossary

Terms used in accessibility research and practice. Each entry has a definition, common aliases, and category tags.

Search results

AI Proxy (also: AI Proxying): A design pattern in which an AI system acts on a user's behalf within a social, communicative, or interpretive setting (for example, generating a facial expression, voice, or written reply that represents the user to others) rather than merely assisting the user with a…
AI-Generated Content (also: AIGC): An umbrella term for text, images, audio, video, and other media produced by generative AI systems, especially large language models and diffusion-based text-to-image or text-to-video models, in response to user prompts. AIGC is widely used in creative tooling (backdrop…
AI-Generated Speech (also: Synthetic Speech, AI Speech): Speech audio produced by artificial intelligence systems (typically neural text-to-speech or voice cloning models) rather than recorded from a human speaker. Deaf and hard-of-hearing content creators increasingly use AI-generated speech to add spoken-language tracks to signed…
Chain-of-Thought (also: CoT, Chain of Thought Reasoning, Step-by-Step Reasoning): Chain-of-thought is a prompting and model-design technique in which a large language model produces its intermediate reasoning steps before giving a final answer. Modern reasoning models (e.g., OpenAI o-series, Claude thinking modes) expose chain-of-thought as visible internal…
ChatGPT (also: GPT, OpenAI ChatGPT): ChatGPT is a conversational generative AI assistant developed by OpenAI, based on the GPT family of large language models. Users interact via a text chat interface and, in newer versions, through voice, image, and file upload. ChatGPT is widely used as an accessibility tool…
Computer-Using Agent (also: CUA): An AI agent, typically built on a large multimodal model, that perceives a computer's graphical user interface through screenshots, reasons about on-screen context, and directly manipulates the interface by clicking, typing, scrolling, and navigating between applications. Unlike…
DALL-E (also: DALL-E 2, DALL-E 3, DALLE): A family of text-to-image generative AI models developed by OpenAI that produces images from natural-language prompts. DALL-E models are widely used by content creators, including people with disabilities, to generate visuals without photography or illustration skills, but they…
ElevenLabs: A commercial AI voice platform that generates realistic synthetic speech and voice clones from text. ElevenLabs is used in accessibility contexts for producing narrated video voiceovers, audiobook-style readings, and personalized text-to-speech voices, and it has been adopted in…
Human-AI Co-Creation (also: Human-AI Co-Creative, Co-Creative AI, Mixed-Initiative Co-Creation): Human-AI co-creation refers to creative work in which a person and an AI system iteratively contribute to the same artifact, with each shaping the other's next move rather than the AI acting as a one-shot tool. In accessibility contexts, co-creative systems are used to scaffold…
LoRA (also: Low-Rank Adaptation): A parameter-efficient fine-tuning technique, introduced by Hu et al. in 2022, in which a large pretrained neural network is specialised by training only a pair of small low-rank matrices that modify specific weight projections, while the original weights remain frozen. LoRA…
Microsoft Copilot (also: Copilot, Microsoft 365 Copilot, Copilot in Excel): Microsoft Copilot is a family of generative AI assistants integrated into Microsoft 365 applications including Excel, Word, PowerPoint, Outlook, and Teams, as well as GitHub and Windows. In Excel and similar spreadsheet workflows, Copilot lets users describe spreadsheet…
Momentous Depiction: A conceptual framework proposed by Niu, Clements, and Kim (2026) for using generative AI to visualize critical moments that convey the insights and meanings of disability in storytelling videos. The framework identifies four core GenAI affordances that support or constrain…
Music GenAI (also: Generative Music AI, AI Music Generation): Generative AI systems that produce musical output (melodies, full songs, instrumental accompaniment, or vocal tracks) from text prompts, seed audio, or structured parameters. Examples include Suno, Udio, MusicLM, and MusicGen. In accessibility and therapy contexts, music GenAI…
Reassurance Robot: A term coined by Grace Barkhuff (CHI 2026) to describe generative AI systems — such as ChatGPT — that, by default, provide reassurance, confession-hearing, and decision-making on demand, thereby accommodating the compulsions of people with Obsessive-Compulsive Disorder (OCD).…
Stable Diffusion: An open-weights latent text-to-image diffusion model released by Stability AI in 2022. It operates by iteratively denoising a random latent tensor, conditioned on text embeddings produced by a frozen CLIP encoder, until the latent can be decoded by a VAE into a coherent image.…
Story Completer: A design role for generative AI in storytelling, proposed by Niu, Clements, and Kim (2026), in which AI systems complete and enrich stories authored by human creators rather than generating full storylines or automating creative decisions. The concept is framed in contrast to AI…
Suno (also: Suno AI, Suno v3.5): A commercial generative AI platform that produces full songs (lyrics, vocals, instrumentation) from short natural-language prompts specifying genre, mood, tempo, and lyrical content. Suno is widely adopted in HCI research on music co-creation, journaling, and therapy because…
Text-to-Audio (also: Text-to-Audio Generation, TTA): A class of generative AI models that synthesise non-speech sound (environmental sounds, sound effects, music stems) from a text prompt, for example, producing the sound of 'leaves rustling in wind' or 'church bells ringing'. Distinct from text-to-speech, which produces spoken…
Text-to-Image Model (also: T2I Model, T2I, Text-to-Image Generator): A generative AI system that produces images from natural-language prompts. Prominent examples include DALL-E, Stable Diffusion, and Midjourney. In accessibility contexts, text-to-image models have been shown to replicate and amplify disability stereotypes; for example,…
Text-to-Sound (also: Text-to-Audio, TTA, Sound Generation from Text): A class of generative AI models that synthesise non-speech audio (sound effects, ambient environments, foley, or short music clips) from a natural-language description such as 'a door creaking shut' or 'cloth ruffling as a coat is removed'. Distinct from text-to-speech, which…
Text-to-Video (also: T2V, Text-to-Video Generation): A class of generative AI models that produces short video clips from natural-language prompts (and sometimes reference images). Examples at the time of writing include Runway Gen, OpenAI Sora, Google Veo, and Pika. For accessibility, text-to-video raises both opportunities…
Voice Cloning (also: Voice Synthesis Cloning, Personalized Text-to-Speech): The use of machine-learning models to synthesise a target speaker's voice from a short reference recording, enabling text-to-speech output that sounds like that specific person. For accessibility, voice cloning has transformative potential: people whose voices are at risk of…
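
Worked example for the LoRA entry above: a minimal sketch of the low-rank update it describes, written in plain Python so it runs without any libraries. The names `matvec`, `lora_forward`, `W`, `A`, `B`, and `alpha` are illustrative assumptions, as are the toy dimensions; the zero initialisation of `B` and the `alpha / r` scaling follow the usual description of the technique, not anything stated in this glossary.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(w, a, b, x, alpha=1.0):
    """Return W x + (alpha / r) * B (A x), where r is the number of rows of A."""
    r = len(a)
    base = matvec(w, x)               # frozen pretrained projection
    update = matvec(b, matvec(a, x))  # low-rank correction, rank at most r
    scale = alpha / r
    return [y + scale * u for y, u in zip(base, update)]

random.seed(0)
d_out, d_in, r = 64, 64, 2  # hypothetical toy sizes

# W stays frozen; only A (r x d_in) and B (d_out x r) would be trained.
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # zero init: no change before training

x = [1.0] * d_in
# With B at zero, the adapted layer reproduces the frozen layer exactly.
assert lora_forward(W, A, B, x) == matvec(W, x)

trainable = r * d_in + d_out * r  # parameters in A and B
frozen = d_out * d_in             # parameters in W
print(f"trainable: {trainable}, frozen: {frozen}")
```

In this toy setting the two small matrices hold 256 values against 4096 in the frozen weight matrix; the gap widens as the layer grows while the rank stays small.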

22 results.
