Automatic Role Detection of Visual Elements of Web Pages for Automatic Accessibility Evaluation
Carlos Duarte, Ana Salvado, M. Elgin Akpinar, Yeliz Yeşilada, Luís Carriço · 2018 · Proceedings of the 15th International Web for All Conference (W4A 2018) · doi:10.1145/3192714.3196827
Summary
This short paper presents an approach to automatically detect the visual roles of web page elements — specifically menus and lists — to enable automated accessibility evaluation tools to assess WCAG techniques that require understanding an element’s semantic role, not just its HTML syntax. The problem is that several WCAG techniques (such as H97, which requires grouping related links using the nav element, and H48, which requires using ol, ul, and dl for lists) cannot be evaluated by purely syntactic analysis because they depend on understanding what role a visual element plays on the page. For example, to check H97 compliance, an evaluator must determine whether links are "visually grouped and represent a section of the page" — a judgement that automated tools cannot make from HTML alone, since many websites implement menus using non-semantic markup (div elements with CSS styling rather than nav elements). The approach extends the Vision-based Page Segmentation (VIPS) algorithm, which segments web pages into visually coherent blocks, with a role detection component that uses heuristic rules based on an ontology of visual elements. For the menu role (LinkMenu), detection rules consider properties such as: the element may use ul, ol, dl, or nav tags; its id/class/source attributes may contain "menu" or "nav" keywords; it usually has a different background colour from its parent; it appears at the top of the page; its content is generally short (1-5 words per link); and it has a fixed position across all pages of the same site. The system was integrated with QualWeb, an automated WCAG 2.0 evaluation tool that previously could only assess 34 HTML and 13 CSS techniques.
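The LinkMenu properties listed above can be read as a checklist of evidence about a segmented block. The following is a minimal sketch of that reading, not the paper's implementation: the Block class, its field names, the 300-pixel "top of page" cutoff and the four-hit threshold are all hypothetical choices made for illustration, and the summary does not say how the actual rules are weighted or combined.

```python
from dataclasses import dataclass, field

MENU_TAGS = {"ul", "ol", "dl", "nav"}
MENU_KEYWORDS = ("menu", "nav")


@dataclass
class Block:
    """Hypothetical stand-in for a visual block produced by page segmentation."""
    tag: str
    attributes: dict = field(default_factory=dict)  # e.g. {"id": ..., "class": ...}
    background: str = "#ffffff"
    parent_background: str = "#ffffff"
    top_px: int = 0  # vertical offset of the block on the page
    link_texts: list = field(default_factory=list)
    same_position_on_all_pages: bool = False


def menu_evidence(block: Block, top_cutoff_px: int = 300) -> dict:
    """Report which of the summarised LinkMenu properties a block satisfies."""
    attr_text = " ".join(str(v) for v in block.attributes.values()).lower()
    words_per_link = [len(text.split()) for text in block.link_texts]
    return {
        "menu_tag": block.tag.lower() in MENU_TAGS,
        "keyword": any(k in attr_text for k in MENU_KEYWORDS),
        "distinct_background": block.background.lower() != block.parent_background.lower(),
        "near_top": block.top_px <= top_cutoff_px,  # assumed cutoff
        "short_links": bool(words_per_link) and all(1 <= n <= 5 for n in words_per_link),
        "fixed_position": block.same_position_on_all_pages,
    }


def looks_like_menu(block: Block, min_hits: int = 4) -> bool:
    """Classify as a menu when enough properties hold (threshold is an assumption)."""
    return sum(menu_evidence(block).values()) >= min_hits


# A div-based menu with no nav element, as described in the summary.
header_links = Block(
    tag="div",
    attributes={"class": "main-navigation"},
    background="#222222",
    parent_background="#ffffff",
    top_px=40,
    link_texts=["Home", "About us", "Contact"],
    same_position_on_all_pages=True,
)

# A body paragraph that happens to contain one long link.
paragraph = Block(
    tag="p",
    attributes={"class": "entry-content"},
    top_px=900,
    link_texts=["read the full announcement on our partner's website here"],
)

print(looks_like_menu(header_links))  # True
print(looks_like_menu(paragraph))     # False
```

Counting evidence rather than requiring every property reflects the hedged wording of the rules ("may use", "usually has", "generally short"): a menu built from div elements fails the tag check yet can still be recognised from the other cues.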
Key findings
The menu detection rules were iteratively refined through three rounds of evaluation against a benchmark dataset of 132 menus from 30 WordPress sites. The initial extended VIPS correctly identified 60 menus but missed 72 and produced 21 false positives (precision 0.741, recall 0.455, F-measure 0.564). After modifications to increase recall, including rules that leverage WAI-ARIA menu roles and class names containing "menu-item", the number of correctly identified menus rose to 83, but false positives also rose, to 63 (precision 0.568, recall 0.629, F-measure 0.597). Three successive filtering stages were then applied to reduce false positives: excluding non-menu elements (article, header, aside, composite, p, table, headings) from menu classification; filtering out text and section elements; and removing elements that have no children containing links (ul, dl, ol, li, a, img). The best-performing filter achieved precision 0.606, recall 0.606, and F-measure 0.606. When evaluated on two new sets of 15 pages each (WordPress and non-WordPress), the WordPress set yielded 87% precision for menus and 86% for lists, while the non-WordPress set achieved 92% precision for menus and 100% for lists. Integration with QualWeb enabled it to assess techniques H48 and H97 for the first time; no fully automated evaluation tool had previously been able to check these techniques.
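The precision, recall and F-measure figures for the first two rounds can be recomputed from the reported counts using the standard definitions. The sketch below does so; the function name is illustrative, and the 49 missed menus in the second round are derived here as 132 minus 83 rather than quoted from the paper.

```python
def precision_recall_f(true_pos: int, false_pos: int, false_neg: int):
    """Standard precision, recall and balanced F-measure from raw counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


TOTAL_MENUS = 132

# Round 1: 60 menus found, 72 missed, 21 false positives.
round1 = precision_recall_f(60, 21, 72)

# Round 2: 83 menus found, 63 false positives; misses derived from the total.
round2 = precision_recall_f(83, 63, TOTAL_MENUS - 83)

for name, (p, r, f) in (("round 1", round1), ("round 2", round2)):
    print(f"{name}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Precision and recall match the reported values for both rounds, as does the second-round F-measure. The first-round F-measure comes out at about 0.563 when computed from the raw counts, against the reported 0.564; the reported figure is what results from applying the formula to the already rounded precision and recall (0.741 and 0.455), so the difference appears to be a rounding effect.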
Relevance
This paper addresses a fundamental limitation of automated accessibility evaluation: the gap between what can be checked syntactically and what requires visual or semantic understanding. The majority of automated tools can only verify a subset of WCAG success criteria — QualWeb could assess only 47 techniques before this work — leaving many checks that require human judgement. By bridging visual page segmentation with accessibility evaluation, this approach expands the scope of automated testing. For accessibility practitioners and tool developers, the specific heuristic rules for menu detection provide a starting point that could be extended to other visual roles (headers, footers, sidebars, content areas). The finding that many websites implement menus using non-semantic markup (div elements instead of nav) reinforces a well-known problem: developers often create visually correct pages that are semantically meaningless to assistive technologies. The modest F-measure (0.606) and the variation between WordPress and non-WordPress sites highlight the difficulty of generalising visual role detection across the diversity of web development practices. This work is a step toward more comprehensive automated evaluation, though human expert review remains necessary for many WCAG requirements.
Tags: automated testing · web accessibility · WCAG · web page segmentation · navigation · screen readers · WAI-ARIA · menus · accessibility evaluation · QualWeb
Standards referenced: WCAG 2.0 · WAI-ARIA