A Platform Agnostic Remote Desktop System for Screen Reading

Syed Masum Billah, Vikas Ashok, Donald E. Porter, IV Ramakrishnan · 2016 · ASSETS '16: Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility · doi:10.1145/2982142.2982151

Summary

Sinter addresses a fundamental barrier to remote desktop accessibility. Traditional remote desktop technology scrapes pixels from the remote screen and redraws them as bitmaps on the client, discarding the semantic information that screen readers require: text content, UI element types, hierarchical relationships, and accessibility properties. Existing solutions such as NVDARemote and JAWS Tandem work around this by running the same screen reader on both ends, but that requires both machines to run the same platform, because screen readers are tied to specific operating systems by differences in accessibility APIs (Microsoft's MSAA and UI Automation, Apple's Accessibility, GNOME's ATK and AT-SPI).

Sinter removes this platform dependency with an intermediate representation (IR). On the remote system, a "scraper" extracts the UI model from the native accessibility API and converts it to a generic XML-based IR, analogous to an HTML DOM tree. The IR is transmitted to the client, where a "proxy" converts it back to native UI widgets that the local screen reader can interpret. The system relays user inputs back to the scraper and receives incremental UI updates in return.

The demonstration showcased bidirectional cross-platform access: Mac users running VoiceOver could access Windows applications (Word, Calculator, Explorer, Registry Editor, Command Line), while Windows users with JAWS or NVDA could access Mac applications (Apple Mail, HandBrake, Messages, Calculator, Contacts).
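
The scraper-to-proxy round trip described above can be illustrated with a minimal Python sketch. It is not Sinter's actual schema or code: the `UINode` class, the `node` element, and the `role`/`name`/`value` attributes are hypothetical names chosen for illustration, and a plain in-memory tree stands in for the native accessibility API on each side.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class UINode:
    """Hypothetical platform-neutral stand-in for one accessibility-tree element."""
    role: str
    name: str = ""
    value: str = ""
    children: list["UINode"] = field(default_factory=list)


def to_ir(node: UINode) -> ET.Element:
    """Scraper side: convert a UI node and its descendants to an XML element."""
    elem = ET.Element("node", {"role": node.role, "name": node.name, "value": node.value})
    for child in node.children:
        elem.append(to_ir(child))
    return elem


def from_ir(elem: ET.Element) -> UINode:
    """Proxy side: rebuild the UI tree from the XML element."""
    return UINode(
        role=elem.get("role", ""),
        name=elem.get("name", ""),
        value=elem.get("value", ""),
        children=[from_ir(child) for child in elem],
    )


def scrape(root: UINode) -> bytes:
    """Serialize the whole tree to bytes, as it might be sent over the network."""
    return ET.tostring(to_ir(root), encoding="utf-8")


def rebuild(payload: bytes) -> UINode:
    """Parse received bytes back into a tree the client could render natively."""
    return from_ir(ET.fromstring(payload))


# Assumed example: a tiny Calculator window with a display and one button.
calculator = UINode("window", "Calculator", children=[
    UINode("text", "Display", "0"),
    UINode("button", "7"),
])
assert rebuild(scrape(calculator)) == calculator
```

The point of the sketch is that roles, names, values, and hierarchy survive the trip, which is exactly what a pixel-based remote desktop loses. A real proxy would additionally map each role to a native widget and apply incremental updates rather than resending the full tree.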

Key findings

A preliminary user study with 21 blind participants at Lighthouse Guild yielded a System Usability Scale (SUS) score of 78, which is considered good from a usability perspective. More significantly, qualitative feedback revealed strong enthusiasm for two aspects. First, participants valued using their preferred screen reader with their own customized settings rather than learning a new platform or screen reader. Screen reader users often spend years developing efficient workflows with specific settings, keyboard shortcuts, and verbosity preferences, and Sinter preserves this investment across platforms. Second, participants noted that Sinter eliminates the need for a screen reader to be installed on the remote host, a common real-world constraint. Many enterprise and cloud environments do not have screen readers installed, which effectively locks blind users out of remote access scenarios.

The IR approach proved architecturally efficient: each platform requires only two relatively simple conversions (native API to IR, and IR to native API), implementable in a few thousand lines of code per direction. By comparison, the NVDA screen reader alone exceeds 50,000 lines.

The IR also enables meta-programming opportunities. The researchers demonstrated a "mega-ribbon" modification for Microsoft Word that surfaces frequently used buttons, bypassing the cumbersome standard ribbon navigation.
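
To make the meta-programming idea concrete, the following Python sketch rewrites an XML tree so that selected buttons also appear in a flat group at the top of the window. It is an assumption-laden illustration, not the authors' implementation: the `node` element, its `role` and `name` attributes, the `add_mega_ribbon` function, and the sample Word tree are all hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def add_mega_ribbon(root: ET.Element, frequent: set[str]) -> ET.Element:
    """Return a copy of the IR with a flat group of frequently used buttons first.

    The input tree is left unmodified. Buttons are collected in document order.
    """
    out = copy.deepcopy(root)
    ribbon = ET.Element("node", {"role": "group", "name": "Mega-ribbon"})
    for node in out.iter("node"):
        if node.get("role") == "button" and node.get("name") in frequent:
            # Copy the button so the original ribbon layout stays intact.
            ribbon.append(copy.deepcopy(node))
    out.insert(0, ribbon)
    return out


# Assumed sample IR: two ribbon tabs with buttons nested at different depths.
word_ir = ET.fromstring(
    '<node role="window" name="Word">'
    '<node role="tab" name="Home">'
    '<node role="group" name="Font">'
    '<node role="button" name="Bold"/>'
    '<node role="button" name="Italic"/>'
    '</node>'
    '</node>'
    '<node role="tab" name="Insert">'
    '<node role="button" name="Table"/>'
    '</node>'
    '</node>'
)
modified = add_mega_ribbon(word_ir, {"Bold", "Table"})
```

Because the transformation operates on the IR alone, the application itself is untouched. A working system would also need each copied button to refer back to the original element so that activating it is relayed to the remote application; the sketch omits that step.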

Relevance

This work addresses a critical gap in workplace accessibility. Remote desktop access is essential for telecommuting, distance learning, cloud computing, and IT support, all increasingly common scenarios. Because screen readers are locked to particular platforms, blind users have been unable to access remote systems running a different operating system, a significant barrier as computing environments become more heterogeneous.

For organizations, Sinter suggests that accessible remote access does not require installing screen readers on every remote system or maintaining platform homogeneity. The lightweight scraper component could potentially be deployed on servers without the full complexity of screen reader software.

The IR transformation concept has implications beyond remote access: it provides a framework for adding accessibility enhancements that are "missing in the original application" without modifying the application itself. This meta-programming potential, demonstrated with the Word mega-ribbon, points to assistive technology that adapts interfaces rather than simply reading them. The core technical contribution (presented in detail at EuroSys 2016) demonstrates that cross-platform screen reader interoperability is achievable with modest engineering effort when semantic UI information is preserved rather than discarded.

Tags: screen reader · remote desktop · cross-platform accessibility · UI virtualization · blind accessibility · assistive technology · NVDA · JAWS · VoiceOver