iSCAN: A Phoneme-based Predictive Communication Aid for Nonspeaking Individuals
Ha Trinh, Annalu Waller, Keith Vertanen, Per Ola Kristensson, Vicki L. Hanson · 2012 · Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2012) · doi:10.1145/2384916.2384927
Summary
This paper presents iSCAN (Interactive Sound-based Communication Aid for Non-speakers), a phoneme-based predictive communication system designed for people with severe speech impairments (SSI) who lack conventional literacy skills. The core problem is that most augmentative and alternative communication (AAC) devices rely on orthographic (spelling-based) input, which excludes the many SSI users who have significant literacy deficits. Phoneme-based systems offer an alternative by allowing users to construct words from speech sounds rather than letters, but previous phoneme-based AAC systems suffered from slow communication rates, difficult access methods, and high learning demands. The authors developed a statistical language model using 6-gram phoneme mixture models and 3-gram word mixture models, trained on crowdsourced AAC-like text converted to phoneme sequences via a pronunciation dictionary. iSCAN incorporates this model into an eight-slice two-layer pie menu interface adapted from the Jolly Phonics teaching system, providing access to 42 English phonemes through only 9 selection targets. The system dynamically rearranges the phoneme layout after each selection to place the most probable next phonemes in the most accessible positions, and offers automatic word completion based on entered phoneme prefixes.
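The two predictive features described above, reordering phonemes by likelihood and completing words from a phoneme prefix, can be illustrated with a minimal sketch. It is not the authors' implementation: it substitutes a simple phoneme bigram count for the paper's 6-gram phoneme and 3-gram word mixture models, and the lexicon, phoneme symbols, word counts, and function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: word -> phoneme sequence (illustrative symbols only).
LEXICON = {
    "cat": ("k", "a", "t"),
    "cap": ("k", "a", "p"),
    "can": ("k", "a", "n"),
    "cot": ("k", "o", "t"),
    "tap": ("t", "a", "p"),
}
# Hypothetical word frequencies standing in for a trained word model.
WORD_COUNTS = {"cat": 5, "cap": 2, "can": 9, "cot": 1, "tap": 3}

START = "<s>"  # marker for the beginning of a word


def train_bigrams(lexicon, word_counts):
    """Count phoneme bigrams, weighting each word by its frequency."""
    counts = defaultdict(Counter)
    for word, phones in lexicon.items():
        prev = START
        for phone in phones:
            counts[prev][phone] += word_counts[word]
            prev = phone
    return counts


def rank_next_phonemes(counts, entered, inventory):
    """Order the inventory so the most probable next phonemes come first.

    Ties (including unseen phonemes) keep their original inventory order,
    so the layout only changes where the model has evidence.
    """
    prev = entered[-1] if entered else START
    seen = counts[prev]
    return sorted(inventory, key=lambda p: (-seen[p], inventory.index(p)))


def complete_words(lexicon, word_counts, entered, k=3):
    """Return up to k words whose pronunciation starts with the entered phonemes."""
    prefix = tuple(entered)
    matches = [w for w, ph in lexicon.items() if ph[:len(prefix)] == prefix]
    return sorted(matches, key=lambda w: (-word_counts[w], w))[:k]


INVENTORY = ["a", "k", "n", "o", "p", "t"]
BIGRAMS = train_bigrams(LEXICON, WORD_COUNTS)
layout_after_k = rank_next_phonemes(BIGRAMS, ["k"], INVENTORY)
suggestions = complete_words(LEXICON, WORD_COUNTS, ["k", "a"])
```

In an interface like the one described, the head of the ranked list would map to the easiest-to-reach selection targets, and the completion list would be offered once the user has entered a prefix. A longer n-gram context would make the ranking depend on more than the last phoneme.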
Key findings
A formative study with 16 able-bodied participants showed that the predictive features produced a 108.4% increase in phoneme entry speed (from 11.07 to 23.07 phonemes per minute) and a 79.0% reduction in phoneme error rate (from 9.19% to 1.93%). Word error rate dropped from 17.15% to 3.63%. All 16 participants preferred the predictive setting; 13 found the dynamic phoneme layout useful and 15 valued the word auto-completion feature. A longitudinal case study with a 41-year-old participant with cerebral palsy ("Alex") showed that after 16 sessions over 11 days, his phoneme entry speed increased from 4.45 to 18.53 phonemes per minute and his word entry speed rose from 1.22 to 4.82 words per minute, with near-zero error rates. Critically, in a comparative evaluation against two orthographic systems Alex had used for years (Say-It!Sam and Assistive Chat), iSCAN achieved a 0.0% word error rate, compared with 19.17% and 21.94%, respectively. Alex ultimately preferred iSCAN over both established systems.
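The relative changes quoted above follow from the before and after values by the usual percent-change formula. The short check below reproduces the two percentages stated in the summary; the remaining three are derived here from the reported figures and do not appear in the summary itself. The function and variable names are illustrative.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Formative study (16 able-bodied participants).
speed_gain = percent_change(11.07, 23.07)        # stated in the summary as a 108.4% increase
phoneme_err_change = percent_change(9.19, 1.93)  # stated in the summary as a 79.0% reduction
word_err_change = percent_change(17.15, 3.63)    # derived here, not stated in the summary

# Case study ("Alex"), first versus final session; both derived here.
alex_phoneme_gain = percent_change(4.45, 18.53)
alex_word_gain = percent_change(1.22, 4.82)

for label, value in [
    ("phoneme entry speed", speed_gain),
    ("phoneme error rate", phoneme_err_change),
    ("word error rate", word_err_change),
    ("Alex phoneme entry speed", alex_phoneme_gain),
    ("Alex word entry speed", alex_word_gain),
]:
    print(f"{label}: {value:+.1f}%")
```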
Relevance
This research demonstrates that phoneme-based input can be a viable, and in some cases superior, alternative to spelling-based AAC for users with limited literacy. The finding that a user with years of experience on orthographic devices achieved a lower word error rate with iSCAN after just 16 sessions is particularly striking and challenges the assumption that letter-based input is always preferable. For accessibility practitioners, this highlights the importance of not treating literacy as a prerequisite for text generation in communication aids. The pie menu interface design, with its minimal selection targets and adaptability to various input devices (touchscreen, joystick, eye-tracking), offers a useful model for designing motor-accessible interfaces more broadly. The work also underscores the value of phonological awareness training as a complement to technology-based interventions for nonspeaking individuals.
Tags: augmentative and alternative communication · phoneme-based communication · predictive text · word prediction · severe speech impairment · cerebral palsy · assistive technology · phonological awareness · motor impairment