Glossary

Terms used in accessibility research and practice. Each entry has a definition, common aliases, and category tags.

Search results

Accessibility dataset (also: Disability-inclusive dataset, Accessible benchmark): A publicly available research dataset that includes data collected from people with disabilities, enabling algorithm development and benchmarking on representative populations rather than exclusively on non-disabled participants. Examples include WeAllWalk (inertial data from…
Benchmark dataset (also: Evaluation dataset, Test benchmark): A standardized dataset used to evaluate and compare the performance of AI models, algorithms, or systems against established baselines. In accessibility, the absence of benchmark datasets that include people with disabilities means disparate performance across disability…
Computational Notebook (also: Jupyter Notebook, Data Science Notebook, IPython Notebook): An interactive document that combines executable code, rich text, data visualizations, and narrative explanations in a single shareable format. Widely used in data science, research, and education through platforms like Jupyter, Google Colab, and…
Data Descriptor (also: Training Data Descriptor): An automated metric or feedback mechanism that characterizes the quality or properties of a dataset, particularly training images used in machine learning. In accessibility research, data descriptors provide non-visual feedback to blind users about the quality of photos they…
Data Representativeness (also: Dataset Representativeness, Demographic Representativeness): The degree to which a dataset reflects the diversity of the population it is intended to serve, particularly across demographic dimensions such as age, gender, race, ethnicity, disability, and socioeconomic status. In AI and machine learning, unrepresentative training data leads…
Data Stewardship (also: Dataset Stewardship, Data Governance): The responsible management of data throughout its lifecycle, including decisions about collection, storage, access, sharing, and disposal. In accessibility research, participatory data stewardship involves disabled data contributors in decisions about how their data is used,…
Dataset Bias (also: Training Data Bias, Data Representation Bias, Sampling Bias): A systematic skew in the composition of training data used to build machine learning models, resulting in models that perform well for overrepresented groups but poorly for underrepresented ones. In accessibility contexts, dataset bias is a pervasive problem: activity…
Datasheets for datasets (also: Dataset documentation, Data cards): A standardized documentation framework proposed by Gebru et al. that accompanies machine learning datasets with information about their creation, composition, intended use, and limitations. For accessibility, datasheets help surface representation gaps, such as whether people…
Differential privacy (also: DP): A mathematical framework for sharing statistical information about a dataset while provably limiting how much any single individual's record can affect, and therefore be inferred from, the released output. In accessibility contexts, differential privacy is proposed as a way to resolve the tension between collecting…
IncluSet: A dataset-surfacing repository created by researchers at the University of Maryland that catalogs and organizes accessibility datasets, meaning datasets sourced from people with disabilities and older adults. IncluSet was developed to make it easier for AI researchers and practitioners…
Information Extraction (also: IE, Data Extraction): The process of automatically identifying and retrieving structured information from unstructured or semi-structured data sources. In the context of accessibility and data visualization, information extraction refers to how users, particularly screen-reader users, pull specific…
LSTM (also: Long Short-Term Memory, LSTM Network): A type of recurrent neural network architecture designed to learn long-term dependencies in sequential data by using special gating mechanisms that control the flow of information through the network. LSTMs are particularly effective for processing time-series data such as…
Re-identification risk (also: De-anonymization risk, Data re-identification): The possibility that an individual can be identified from supposedly anonymized data by combining multiple data points or matching against external datasets. People with disabilities face heightened re-identification risk because uncommon combinations of attributes, such as rare…
Topic Modeling (also: LDA, Latent Dirichlet Allocation): A machine learning technique that automatically discovers abstract themes or topics within a collection of documents by analyzing patterns of word co-occurrence. Latent Dirichlet Allocation (LDA) is the most widely used topic modeling algorithm. In accessibility research, topic…
Training Data (also: Training Set, Training Dataset): The collection of labeled examples used to teach a machine learning model to perform a specific task. The quality, quantity, and diversity of training data directly determine how well a model will perform. In accessibility contexts, training data quality is especially important…
Usage analytics (also: Telemetry, Interaction logging): The collection and analysis of data about how users interact with a technology system in real-world settings, including session duration, feature usage frequency, settings preferences, and interaction patterns over time. In assistive technology research, large-scale usage…
Web Mining (also: Web Data Mining, Web Content Mining): The application of data mining techniques to extract and discover useful information from web data, including web content, structure, and usage patterns. In accessibility evaluation, web mining can be used to analyse source code and DOM structures at scale to identify…
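
Illustrative sketch for the Re-identification risk entry: one rough way to see why uncommon attribute combinations matter is to count how many records in a dataset have a combination of attribute values that no other record shares. The function name `unique_combination_rate`, the field names, and the records below are all hypothetical and made up for illustration; this is a simplified indicator under those assumptions, not an established risk metric or a method taken from any of the entries above.

```python
from collections import Counter


def unique_combination_rate(records, quasi_identifiers):
    """Fraction of records whose combination of the given attributes is unique.

    A record with a unique combination could, in principle, be singled out
    by anyone who knows those attribute values from another source.
    """
    if not records:
        return 0.0
    # Build one key per record from the selected attributes.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Hypothetical records (invented for this sketch, not real data).
records = [
    {"age_band": "30-39", "region": "North", "assistive_tech": "screen reader"},
    {"age_band": "30-39", "region": "North", "assistive_tech": "screen reader"},
    {"age_band": "30-39", "region": "North", "assistive_tech": "none"},
    {"age_band": "30-39", "region": "North", "assistive_tech": "none"},
    {"age_band": "70-79", "region": "South", "assistive_tech": "eye tracker"},
]

rate = unique_combination_rate(
    records, ["age_band", "region", "assistive_tech"]
)
print(rate)  # 0.2: one of the five records has a combination nobody else shares
```

In this toy data only the last record stands alone, so removing names would not be enough to protect that contributor. A real assessment would also need to consider external datasets an attacker could match against, which this sketch does not model.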

17 results.
