Glossary

Terms used in accessibility research and practice. Each entry has a definition, common aliases, and category tags.

Search results

AAC Corpus (also: AAC Text Corpus, Augmentative Communication Corpus): A collection of text produced by or representative of Augmentative and Alternative Communication (AAC) device users, used for training and evaluating language models and word prediction systems. AAC corpora are notoriously difficult to assemble because AAC users produce text…
Aspect-Based Sentiment Analysis (also: ABSA, Aspect-Level Sentiment Classification): A natural language processing technique that identifies both the specific topics or aspects being discussed in text (such as food quality, customer service, or pricing in a restaurant review) and the sentiment expressed about each aspect (positive, negative, or neutral). Unlike…
Atomic Facts (also: Atomic Claims): Self-contained units of information extracted from longer text, each representing a single verifiable claim or observation. In AI reliability research, decomposing model responses into atomic facts enables systematic comparison of what different models agree or disagree about.…
Attention Mechanism (also: Attention): A technique in neural networks that allows models to focus on relevant parts of the input when generating each part of the output, rather than relying solely on a fixed-length context vector. In sequence-to-sequence models, attention computes a weighted combination of all…
Automated Readability Scoring (also: ARSS, Automated Readability Scoring System, Readability Assessment): The use of computational methods to automatically evaluate the reading difficulty level of a text. Traditional readability formulas like Flesch-Kincaid and Dale-Chall use surface features such as average sentence length, word length, and vocabulary frequency to assign…
Automatic Readability Assessment (also: Readability Prediction, Reading Level Assessment): The computational task of predicting how difficult a text is for a reader, usually expressed as a grade level or a readability score. Modern systems treat readability as a machine-learning classification or regression problem that combines shallow surface features (sentence…
Automatic Text Simplification (also: ATS, Automated Simplification): The use of computational methods to reduce the complexity of text while preserving its meaning, making it more accessible to readers with disabilities or limited literacy. Automatic text simplification includes lexical simplification (replacing difficult words with simpler…
BERT (also: Bidirectional Encoder Representations from Transformers): A natural language processing model developed by Google that uses bidirectional training to understand context from both directions in a sentence. BERT and its variants like SBERT (Sentence-BERT) are increasingly used in accessibility applications for tasks such as automatic…
BLEU Score (also: BiLingual Evaluation Understudy, BLEU): A metric for evaluating the quality of machine-generated text by comparing it to one or more reference (human-written) translations. BLEU calculates precision by counting how many n-grams (sequences of words) in the predicted text match n-grams in the reference text, with BLEU-1…
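
The n-gram matching described in the BLEU Score entry can be illustrated with a minimal Python sketch. It computes only the clipped n-gram precision for a single value of n; it is not a full BLEU implementation, since it omits the brevity penalty and the combination across n-gram orders. The function names `ngram_counts` and `clipped_precision` are hypothetical, chosen for this illustration.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, references, n):
    """Share of candidate n-grams that also occur in a reference.

    Each candidate n-gram is credited at most as many times as it
    appears in any single reference (the "clipping" step).
    """
    candidate_counts = ngram_counts(candidate, n)
    if not candidate_counts:
        return 0.0
    # Highest count of each n-gram across all references.
    max_reference = Counter()
    for reference in references:
        for gram, count in ngram_counts(reference, n).items():
            max_reference[gram] = max(max_reference[gram], count)
    matched = sum(min(count, max_reference[gram])
                  for gram, count in candidate_counts.items())
    return matched / sum(candidate_counts.values())


reference = "the cat is on the mat".split()
candidate = "the cat sat on the mat".split()
print(clipped_precision(candidate, [reference], 1))  # unigram precision
print(clipped_precision(candidate, [reference], 2))  # bigram precision
```

Clipping keeps a degenerate candidate such as "the the the the the the the" from scoring perfectly against this reference: only two of its seven unigrams are credited.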
Chain-of-Thought Prompting (also: CoT Prompting): A technique for improving the reasoning capabilities of large language models by instructing them to break down complex tasks into intermediate reasoning steps before producing a final answer. In accessibility applications, chain-of-thought prompting is used to improve the…
Chart Question Answering (also: Chart QA, ChartQA, Visual Question Answering for Charts): The task of answering natural-language questions about a data visualization, typically a chart provided as an image or structured specification. A chart question answering system must identify the chart type, extract the underlying data, interpret axes and legends, and answer…
Clue and Reasoning Prompting (also: CARP, Clue-and-Reasoning Prompting): A prompt engineering strategy for large language models that instructs the model to first identify textual clues (keywords, phrases, contextual information) in the input and then perform diagnostic reasoning based on those clues before producing a classification output.…
Coh-Metrix: A web-based tool developed at the University of Memphis that analyses text on more than a hundred measures of language, cohesion, and readability, including referential and semantic cohesion, lexical diversity, syntactic complexity, and latent semantic analysis. Coh-Metrix moves…
Content Summarization (also: Text Summarization, Automated Summarization): The process of condensing longer text content into shorter, focused summaries that capture the essential information. In accessibility contexts, content summarization addresses the information overload that screen reader users face when navigating verbose or redundant web…
Controlled Language (also: Controlled Natural Language, CL): An explicitly defined restriction of a natural language that specifies constraints on vocabulary, grammar, and style to improve clarity, consistency, and machine processability of text. In accessibility, controlled language rules can be applied to improve the quality of content…
Coreference (also: Coreference Resolution, Anaphora Resolution): The linguistic phenomenon of two or more expressions in a text referring to the same real-world entity; for example, "Sam", "she", and "the scientist" can all refer to the same person. Coreference resolution is the NLP task of automatically linking these expressions into…
Corpus (also: Language Corpus, Text Corpus, British National Corpus): A corpus is a large, structured collection of texts used to train, tune, or evaluate language-processing systems. Representative examples include the British National Corpus (BNC, 100 million words of British English), the Penn Treebank, and more recently Common Crawl and…
Data-to-Text (also: Data-to-Text Generation, Data-to-Text NLG): A subfield of natural language generation (NLG) that automatically produces human-readable text from structured data, such as databases, spreadsheets, or sensor readings. Data-to-text systems analyze input data to identify patterns, trends, and salient features, then generate…
Dialog Act (also: Dialogue Act, Speech Act): A classification label representing the communicative intention behind a spoken or written utterance in a conversational system. In the context of accessible technology, dialog acts are used to interpret what a user wants to accomplish when issuing voice commands; for example,…
Diphone (also: Diphone Synthesis): A unit of speech used in text-to-speech synthesis, consisting of the transition from the middle of one phoneme to the middle of the next. Diphone-based synthesis works by recording a set of all possible phoneme-to-phoneme transitions in a language and concatenating the…
Direct Machine Translation (also: Direct MT, Dictionary-Based Machine Translation): The simplest machine-translation paradigm: source-language words are translated into target-language words using a bilingual dictionary, with limited or no syntactic analysis and only shallow reordering heuristics. Direct MT is cheap to build and always produces some output, but…
Directional Stimulus Prompting (also: DSP): A prompt engineering technique for large language models that provides specific keywords or directional stimuli to guide the model toward generating output focused on particular aspects or attributes. In accessibility applications, DSP is used to produce targeted,…
Entity Density (also: Entity-Density Features): A discourse-level readability feature measuring how many distinct entities, both named entities (people, places, organisations) and general nouns, a text introduces per sentence or document. High entity density increases working-memory load on readers because each new entity must…
Entity Grid (also: Entity-Grid Model): A model of local text coherence proposed by Barzilay and Lapata (2008) that represents a document as a two-dimensional grid: rows are sentences, columns are salient entities, and each cell records the grammatical role of that entity in that sentence (subject, object, other, or…
Error Taxonomy (also: Error Classification, Error Typology): A systematic classification of the types of errors that users or learners commonly make, organised into categories based on the nature, source, or linguistic level of the error. In accessibility and educational technology, error taxonomies are used to build intelligent systems…
Error-spread modelling (also: Error propagation modelling, Error radiation): An approach to evaluating the impact of speech recognition errors that accounts for how a single misrecognized word degrades comprehension of its neighbouring words, not just the word itself. For example, misrecognizing "kitchen" as "kitten" makes the subsequent word "area"…
Evocation (also: Word Association Strength, Semantic Evocation): A measure of how strongly one word brings another word to mind, reflecting the associative connections between concepts in human semantic memory. Unlike formal semantic relationships such as synonymy or hyponymy, evocation captures the informal, often idiosyncratic associations…
Extractive Summarization (also: Extractive Text Summarization): Extractive summarization is a natural language processing technique that creates summaries by selecting and preserving key words, phrases, or sentences directly from the original text, rather than generating new text (which is called abstractive summarization). In accessibility…
Few-Shot Prompting (also: In-Context Learning, Few-Shot Learning): A technique for guiding large language models by providing a small number of examples within the input prompt to demonstrate the desired task or output format. In accessibility applications, few-shot prompting can help AI systems perform context-specific tasks like correcting…
Fluency (also: Text fluency, Grammatical fluency): In natural language processing and text simplification, fluency is the degree to which a piece of text is grammatically correct and reads naturally in the target language. It is one of three standard evaluation dimensions for automatic text simplification alongside complexity…
Gold-Standard Evaluation (also: Gold Standard, Reference Standard Evaluation): An evaluation methodology in natural language processing and generation where system output is compared against a set of pre-established correct or ideal responses. In text-based systems, gold-standard strings are human-produced reference outputs that serve as benchmarks.…
Goodness of Pronunciation (also: GOP, GOP Score): A computational measure used in automatic speech recognition to assess how closely a spoken utterance matches expected pronunciation patterns. GOP scores are calculated by comparing phone sequences from unrestricted ASR against forced alignment to the actual word sequence. In…
Grammaticality (also: Grammatical correctness, Grammatical acceptability): The degree to which a sentence conforms to the grammatical rules of a language. In accessibility and NLP research, grammaticality is typically assessed via a 5-point Likert subjective judgement (e.g., "This sentence is grammatically correct") and is used as a component of…
Interlingua (also: Interlingual Representation, Interlingual MT): In machine translation, a language-neutral semantic representation that serves as an intermediate form between the source and target languages. An interlingual MT system first analyses the source text into this representation and then generates the target text from it, so the…
Language Model (also: Statistical Language Model, LM): A computational model that assigns probabilities to sequences of words, enabling prediction of likely next words or sentences in text. In assistive technology, language models power word and sentence prediction systems by learning patterns from training corpora. Modern AAC…
Language Understanding Intelligent Service (also: LUIS, Azure LUIS): A cloud-based Microsoft Azure service that applies machine learning to natural language text to predict meaning and extract relevant information. LUIS identifies user intents (what they want to do) and entities (key information in their utterance). In accessibility applications,…
Latent Semantic Analysis (also: LSA, Latent Semantic Indexing, LSI): A natural language processing technique that uses mathematical methods (Singular Value Decomposition) to identify patterns in relationships between words and concepts within a large corpus of text. In accessibility applications, LSA enables context-aware word prediction by…
Levenshtein Distance (also: Edit Distance): A metric that measures the minimum number of single-character edits (insertions, deletions, and substitutions) needed to transform one string into another. In accessibility research, Levenshtein distance is used to quantify how much users modify AI-generated or existing text,…
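
The definition in the Levenshtein Distance entry can be sketched in a few lines of Python. This is an illustrative dynamic-programming version that keeps only two rows of the edit table; the function name `levenshtein` is an assumption made for this example and does not come from any particular library.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `source` into `target`."""
    # previous[j] is the distance between the source prefix processed
    # so far and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]  # deleting i characters yields the empty prefix
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

To quantify how much a user changed a suggested text, the same function can be applied to the suggestion and the final text, optionally divided by the length of the longer string to give a normalized score.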
Lexical Chain (also: Lexical Chaining, Lexical Cohesion): A sequence of semantically related words running through a text (for example, "doctor", "hospital", "nurse", "patient"), connected by relations like synonymy, hypernymy, or hyponymy. Lexical chains capture the topical coherence of a document and are used in readability…
Lexical Elaboration (also: Vocabulary Elaboration): A text adaptation technique that makes content more accessible by adding explanatory information for complex or unfamiliar words, rather than replacing or removing them. Unlike text simplification, which rewrites content using simpler language, lexical elaboration preserves the…
Lexical Semantics: The branch of linguistics concerned with the meaning of words and the relationships between word meanings, including synonymy, antonymy, and the semantic roles words can fill in sentences. In assistive technology, lexical semantic knowledge is used in AAC systems and text…
Lexical Simplification (also: Word-Level Simplification): A form of text simplification that focuses on replacing complex, uncommon, or technical words with simpler, more familiar alternatives. Lexical simplification is particularly relevant for readers with limited vocabulary, including people who are deaf or hard of hearing (for whom…
Machine Translation (also: MT, Automated Translation): Machine translation is the use of computer software to automatically translate text or speech from one language to another. In accessibility contexts, machine translation is particularly relevant to sign language accessibility, where translating written or spoken text into sign…
Multimodal Natural Language Generation (also: Multimodal NLG): Natural language generation systems that produce output coordinated across more than one modality, typically combinations of text or speech with graphics, maps, animation, gesture, or tactile output. Multimodal NLG systems decompose their output into several "channels" that are…
N-gram (also: Bigram, Trigram, Unigram): A contiguous sequence of n items (typically words) from a text, used in language modeling to predict the probability of a word based on its predecessors. A unigram considers single words in isolation, a bigram considers pairs of consecutive words, and a trigram considers…
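
As a rough illustration of the N-gram entry, the Python sketch below extracts n-grams from a token list and estimates the probability of a word given the word before it from raw bigram counts. The toy sentence and the function names `ngrams` and `bigram_probability` are assumptions made for this example; a practical language model would add smoothing and train on a much larger corpus.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous sequence of n tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigram_probability(tokens, previous_word, word):
    """Estimate P(word | previous_word) from unsmoothed bigram counts."""
    bigram_counts = Counter(ngrams(tokens, 2))
    # Number of times previous_word is followed by any word at all.
    context_count = sum(count for (first, _), count in bigram_counts.items()
                        if first == previous_word)
    if context_count == 0:
        return 0.0  # the context never occurs, so there is no estimate
    return bigram_counts[(previous_word, word)] / context_count


tokens = "i want tea i want coffee i need tea".split()
print(ngrams(tokens, 3)[0])                       # ('i', 'want', 'tea')
print(bigram_probability(tokens, "want", "tea"))  # 0.5
```

In this toy data "want" is followed once by "tea" and once by "coffee", so each continuation receives a probability of 0.5.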
Named Entity Recognition (also: NER): A natural language processing technique that identifies and classifies named entities in text into predefined categories such as person names, locations, organizations, quantities, and domain-specific terms. In accessibility applications, NER can be used to extract meaningful…
Natural Language Generation (also: NLG, Text Generation): A subfield of artificial intelligence and computational linguistics focused on automatically producing human-readable text from structured data or other non-linguistic representations. In accessibility, natural language generation is used to create textual descriptions of visual…
Natural Language Processing (also: NLP, Computational Linguistics): A branch of artificial intelligence that enables computers to understand, interpret, and generate human language. In accessibility, NLP powers voice-based assistive technologies, automatic captioning, text simplification for cognitive accessibility, and natural language query…
Part of Speech (also: POS, Word Class, POS Tag): A grammatical category assigned to each word (or, in signed languages, each sign) in a sentence, such as noun, verb, adjective, adverb, pronoun, preposition, or conjunction. Automatic part-of-speech tagging is a foundational step in natural language processing pipelines. In…
Part-of-Speech (also: POS, Word Class, Lexical Category): The grammatical category of a word: noun, verb, adjective, adverb, preposition, pronoun, conjunction, determiner, and so on. Part-of-speech labels are the basic output of part-of-speech tagging and a foundational input to many accessibility NLP pipelines: readability…
