Optical Character Recognition (OCR)

Also known as: OCR, Text Recognition

Technology that converts images of text (scanned documents, photographs of signs, or PDF pages stored as images) into machine-readable text that screen readers, search engines, and other software can process. OCR is a critical tool for making scanned documents accessible to people with print disabilities, but its accuracy varies significantly by language and script. OCR for Latin scripts is relatively mature, whereas recognition of scripts such as Arabic, Chinese, and Devanagari remains less reliable because of their visual complexity, connected letter forms, and diacritical marks. Even high-quality OCR output typically requires manual proofreading.
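
One common way to put a number on the accuracy mentioned above is the character error rate (CER): the edit distance between the OCR output and a manually proofread reference, divided by the length of the reference. The sketch below is illustrative only and is not drawn from the cited source; the function names `edit_distance` and `character_error_rate` are hypothetical, and the sample strings are invented for the example.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # reference character missing from OCR output
                current[j - 1] + 1,      # extra character in OCR output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the length of the proofread reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Invented example: the OCR engine read the letter "l" as the digit "1".
proofread = "Accessible"
ocr_output = "Accessib1e"
print(character_error_rate(proofread, ocr_output))  # 0.1
```

A CER of 0.1 means roughly one character in ten needs correction. Comparing CER across samples in different scripts is one way to observe the language-dependent variation described above, assuming a proofread reference exists for each sample.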

Category: technology · document accessibility · assistive technology

Related: Print Disability · Document Accessibility

Sources

https://doi.org/10.1145/2596695.2596712