Glossary

Terms used in accessibility research and practice. Each entry has a definition, common aliases, and category tags.

Category: computer vision

OCR (Optical Character Recognition) (also: OCR, Optical Character Recognition, Text Recognition): A computer-vision technology that converts images of printed, handwritten, or on-screen text into machine-readable character data. OCR is foundational to a wide range of accessibility tools: extracting alt-text for image-based PDFs, reading labels for screen-reader users (e.g.,…
ORBIT Dataset (also: Object Recognition for Blind Image Training): A disability-first machine learning dataset for teachable object recognition, contributed by people who are blind or have low vision. The original ORBIT dataset (Massiceti et al., 2021) contains 3,822 videos of 486 objects from 67 data collectors, predominantly in the UK and…
Object Detection (also: Object Recognition): A computer vision technique that identifies and locates specific objects within images or video frames, typically by drawing bounding boxes around detected items and classifying them. In video accessibility, object detection enables automatic identification of video elements…
Object Recognition (also: Object Detection): A computer vision capability that identifies and classifies objects within images or video frames. In visual assistance technologies, object recognition enables automated description of what the camera captures, helping blind users identify items in their environment. However,…
Object Status Recognition (also: Object State Recognition, Object Transformation Detection): The computer vision task of identifying the current condition or transformation state of objects, such as whether an ingredient is raw, chopped, sauteed, or blended. Object status recognition goes beyond simple object detection (identifying what is present) to understand how…
Open-Vocabulary Detection (also: Open-Vocabulary Object Detection, OVD): A class of computer vision object detection models that accept arbitrary text queries at inference time rather than being restricted to a fixed set of pre-trained classes. Instead of only recognizing, for example, the 80 COCO categories, an open-vocabulary detector (such as…
OpenPose: An open-source computer vision library developed by Carnegie Mellon University that detects human body, hand, facial, and foot keypoints in real time from images or video. OpenPose extracts 25 body keypoints, 21 keypoints per hand, and 70 facial landmarks, providing a skeletal…
Optical Flow: A computer vision method that estimates the apparent motion of objects between consecutive video frames by tracking pixel displacement patterns. Optical flow calculates velocity vectors showing movement direction and speed across an image. In assistive technology, optical flow…
Optical Music Recognition (also: OMR): Computer vision technology that automatically converts images of printed or handwritten music notation into machine-readable digital formats such as MusicXML. OMR is analogous to OCR (Optical Character Recognition) for text. While OMR can potentially streamline the creation of…
Overlay Detection (also: Overlay Recognition): The process of automatically identifying graphical or textual elements overlaid on top of video content, such as pop-up graphics, watermarks, banners, subtitles, logos, and text annotations. Overlay detection uses computer vision techniques including edge detection, shape…
