StateLens: A Reverse Engineering Solution for Making Existing Dynamic Touchscreens Accessible

Anhong Guo, Junhan Kong, Michael Rivera, Frank F. Xu, Jeffrey P. Bigham · 2019 · ACM Symposium on User Interface Software and Technology · doi:10.1145/3332165.3347873

Summary

This paper presents StateLens, a three-part system that makes existing dynamic touchscreen interfaces accessible to blind users without requiring any modification to the touchscreen hardware or software. Blind people routinely encounter inaccessible touchscreens on coffee machines, payment terminals, subway ticket machines, in-flight entertainment systems, and other public devices that they cannot use independently. StateLens addresses three core challenges: touchscreens are inherently visual, interfaces change dynamically across multiple screens, and blind users risk accidentally triggering actions while trying to explore the screen. First, the system reverse engineers a touchscreen's underlying state diagram from point-of-view usage videos (found online or recorded by sighted volunteers) using a hybrid crowd-computer vision pipeline that combines SURF feature detection, OCR, screen detection via Amazon Rekognition, and crowdsourced labeling. Second, StateLens automatically generates a conversational agent using Google Dialogflow that lets blind users prespecify the task they want to accomplish before touching the device. Third, a set of 3D-printed accessories (finger caps and conductive styluses) enables "risk-free exploration" of capacitive touchscreens by letting users explore the screen without accidentally triggering touches.
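To make the idea of a reverse-engineered state diagram concrete, here is a minimal Python sketch of how such a diagram might be stored and searched for the sequence of button presses that completes a task. The coffee-machine states, the button labels, and the `plan_presses` helper are all hypothetical illustrations; the summary does not describe the paper's actual data structures or guidance algorithm, and breadth-first search is used here only as one plausible choice.

```python
from collections import deque

# Hypothetical state diagram: each screen (state) maps the label of a
# button on that screen to the screen it leads to. Not taken from the paper.
STATE_DIAGRAM = {
    "home": {"Coffee": "size", "Tea": "tea_type"},
    "size": {"Small": "confirm", "Large": "confirm", "Back": "home"},
    "tea_type": {"Green": "confirm", "Black": "confirm", "Back": "home"},
    "confirm": {"Start": "brewing", "Cancel": "home"},
    "brewing": {},
}


def plan_presses(diagram, start, goal):
    """Return a shortest list of button labels leading from start to goal.

    Uses breadth-first search over the state diagram. Returns None when
    the goal screen cannot be reached from the start screen.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, presses = queue.popleft()
        if state == goal:
            return presses
        for label, next_state in diagram.get(state, {}).items():
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, presses + [label]))
    return None


if __name__ == "__main__":
    print(plan_presses(STATE_DIAGRAM, "home", "brewing"))
```

A guidance layer could read the returned labels to the user one screen at a time, moving to the next label once the expected screen is detected. A real system would also need to handle screens that were never observed in the source videos, which this sketch ignores.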

Key findings

A formative study with 16 blind participants revealed that accessing public touchscreens often requires asking strangers for help, which raises serious privacy concerns, especially with financial transactions and medical kiosks. The technical evaluation across 12 different touchscreen interfaces (coffee machines, ATMs, printers, subway ticket machines, etc.) using 28 videos showed that combining screen detection, SURF features, and OCR achieved the best state reconstruction performance, with precision and recall generally above 0.7. StateLens maintained stable processing time (~5 fps) and error rates (~5%) as the number of interface states increased, unlike baseline approaches, whose performance degraded linearly. A user study with 14 blind participants demonstrated that the 3D-printed accessories effectively enabled risk-free exploration (finger cap: M=0.05 accidental triggers; stylus: M=0.03), the conversational agent achieved 100% task completion for prespecifying orders (M=53.7 seconds), and the complete system achieved 94.7% task completion for realistic multi-step tasks. Participants rated all components as highly useful (M=6.4-6.6 out of 7) and easy to learn (M=5.5-6.3).
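For readers unfamiliar with the precision and recall figures quoted above, the following Python sketch shows how the two metrics are computed in general. It assumes, purely for illustration, that reconstructed and ground-truth states can be compared as sets of identifiers; the summary does not say how the paper matched states, and the state names and the `precision_recall` helper below are hypothetical.

```python
def precision_recall(predicted, ground_truth):
    """Compute precision and recall for a set of predicted items.

    precision = correct predictions / all predictions
    recall    = correct predictions / all ground-truth items
    Empty denominators yield 0.0 rather than raising an error.
    """
    predicted, ground_truth = set(predicted), set(ground_truth)
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical example: four states were reconstructed, one of them
    # spurious, while the real interface has five states.
    reconstructed = {"home", "size", "confirm", "spurious_duplicate"}
    actual = {"home", "size", "tea_type", "confirm", "brewing"}
    print(precision_recall(reconstructed, actual))
```

In this made-up case three of the four reconstructed states are real (precision 0.75) and three of the five real states were found (recall 0.6). Read this way, values above 0.7 on both metrics mean that most reconstructed states were genuine and most genuine states were recovered.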

Relevance

StateLens addresses one of the most pervasive and frustrating everyday accessibility barriers: the proliferation of touchscreen-only interfaces in public spaces that exclude blind users. Unlike solutions that require manufacturers to build in accessibility (which they rarely do for embedded devices), StateLens works with touchscreens as they exist in the wild, making it immediately practical. For accessibility practitioners, the paper highlights that touchscreen inaccessibility is not just an inconvenience but a privacy and independence issue, as blind users are forced to share sensitive information (PINs, medical data) with strangers. The 3D-printed accessories for risk-free exploration are a simple innovation that could be useful on their own. For organizations deploying public kiosks, this research underscores the urgent need to build accessibility into touchscreen interfaces from the start, while demonstrating that post-hoc solutions, though not ideal, can meaningfully bridge the gap.

Tags: touchscreen accessibility · blindness · computer vision · crowdsourcing · reverse engineering · conversational agents · 3D printing · assistive technology · kiosk accessibility · state diagrams