
Web Information Extraction

Also known as: WIE, Web Data Extraction, Web Scraping

Web Information Extraction (WIE) is a set of techniques for automatically identifying and extracting structured data from web pages. In accessibility work, WIE methods analyze the visual rendering of a page to infer document structure, semantic roles, and content relationships that are not explicitly marked up in the HTML. For example, a WIE-based enricher can detect headings from font size and style patterns even when heading tags are missing, identify navigation menus from spatial clusters of links, or determine reading order from geometric layout analysis. These techniques are particularly valuable because they do not depend on correct semantic HTML markup: they work from the page as the browser renders it and recover the same structural cues that sighted users perceive visually.
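
As a rough illustration of the heading and reading-order heuristics described above, the following Python sketch works on a list of rendered text boxes. The `TextBox` fields, the function names, and the 1.2 size ratio are all assumptions made for this example, not part of any particular WIE tool. In practice the boxes would come from the browser's computed styles and layout geometry.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TextBox:
    """One rendered text run. All fields are hypothetical inputs for this sketch."""
    text: str
    font_size: float  # rendered size in CSS pixels
    x: float          # left edge of the box
    y: float          # top edge of the box


def body_font_size(boxes):
    """Estimate the body text size as the font size covering the most characters."""
    weights = Counter()
    for box in boxes:
        weights[box.font_size] += len(box.text)
    return weights.most_common(1)[0][0]


def detect_headings(boxes, ratio=1.2):
    """Return boxes whose font size is at least `ratio` times the body size.

    The ratio is an assumed threshold; real pages may need tuning or
    extra cues such as font weight and vertical spacing.
    """
    if not boxes:
        return []
    base = body_font_size(boxes)
    return [box for box in boxes if box.font_size >= base * ratio]


def reading_order(boxes):
    """Naive reading order: top to bottom, then left to right.

    This ignores multi-column layouts, which need real page segmentation.
    """
    return sorted(boxes, key=lambda box: (box.y, box.x))
```

Calling `detect_headings(reading_order(boxes))` returns the candidate headings in the order they appear visually, regardless of whether the underlying HTML uses heading tags.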

Category: Web Development · Artificial Intelligence

Related: Page Segmentation · Navigation Axis · DOM · Screen Reader
