Phoneme-based Predictive Text Entry Interface

Ha Trinh, Annalu Waller, Keith Vertanen, Per Ola Kristensson, Vicki L. Hanson · 2014 · Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS) · doi:10.1145/2661334.2661424

Summary

This paper presents iSCAN-2, an iPad application that enables nonspeaking individuals to enter text by selecting phonemes (speech sounds) rather than spelling words with letters. People with severe speech impairments often face challenges in literacy acquisition, which can make conventional orthographic (letter-based) text entry difficult or impossible. Phoneme-based entry lets users construct words from sequences of speech sounds they know from spoken language, without needing to know how to spell. The system uses 42 spoken phonemes from the Jolly Phonics literacy programme, organised into 7 groups and mapped onto a two-layer pie menu designed for users with limited dexterity. The front layer shows the 7 phoneme groups, each represented by its most probable phoneme; selecting a group reveals all of its phonemes on the phoneme layer. Entry follows three steps: select the phoneme group, navigate to the desired phoneme, and move back to the centre to confirm. iSCAN-2 builds on the earlier iSCAN system by adding three rate enhancement strategies. First, a dynamic phoneme layout rearranges phoneme groups according to the most likely next phonemes, using a 6-gram phoneme language model. Second, phoneme set reduction narrows the visible phonemes to those that can follow the current sequence. Third, a 5-word prediction pie menu appears after each phoneme selection and offers probable word completions, estimated from a 3-gram word language model that considers the current phoneme prefix and up to two prior words.
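The phoneme set reduction and word prediction steps can be illustrated with a minimal sketch. It is not the authors' implementation: where the paper uses a 6-gram phoneme model and a 3-gram word model, this sketch substitutes a simple count-weighted lookup over a tiny pronunciation lexicon. The lexicon, the phoneme symbols, the counts, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> (phoneme sequence, usage count).
# Symbols and counts are illustrative only, not data from the paper.
LEXICON = {
    "cat": (("k", "a", "t"), 50),
    "cap": (("k", "a", "p"), 20),
    "can": (("k", "a", "n"), 80),
    "kit": (("k", "i", "t"), 10),
    "sat": (("s", "a", "t"), 30),
}


def matching_words(prefix):
    """Return the words whose pronunciation starts with the phoneme prefix."""
    prefix = tuple(prefix)
    return [w for w, (ph, _) in LEXICON.items() if ph[:len(prefix)] == prefix]


def allowed_next_phonemes(prefix):
    """Phonemes that can follow the prefix, most probable first.

    Phonemes absent from the result cannot extend the prefix into any
    lexicon word (set reduction); the ordering could drive a dynamic layout.
    """
    n = len(prefix)
    scores = Counter()
    for word in matching_words(prefix):
        phonemes, count = LEXICON[word]
        if len(phonemes) > n:
            scores[phonemes[n]] += count
    return [p for p, _ in scores.most_common()]


def predict_words(prefix, k=5):
    """Return up to k word completions for the prefix, most frequent first."""
    words = matching_words(prefix)
    return sorted(words, key=lambda w: -LEXICON[w][1])[:k]
```

With the prefix `["k", "a"]`, `allowed_next_phonemes` keeps only the phonemes that complete a lexicon word and `predict_words` ranks the matching words by count. A real system would condition these scores on the preceding phonemes and words, as the n-gram models described above do.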

Key findings

A case study with a nonspeaking participant showed that the rate enhancement strategies improved both text entry speed and error rate relative to the original iSCAN system. The dynamic phoneme layout reduces navigation time by placing the most probable next phonemes in easily accessible positions on the pie menu. Phoneme set reduction removes phonemes that cannot follow the current sequence, which prevents errors and lowers the cognitive load of scanning irrelevant options. The 5-word prediction menu, integrated into the pie menu interface, lets users jump from a partial phoneme sequence directly to a complete word, reducing the number of phoneme selections needed. The system applies statistical language modelling techniques that are well established in letter-based text entry (word prediction, n-gram models) to the phoneme domain, where they are equally applicable but had not previously been combined in this way. The pie menu is designed for users with limited motor control: it requires only directional movements toward targets rather than precise pointing.
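To show why directional selection is more forgiving than precise pointing, here is a minimal sketch of mapping a touch displacement from the menu centre to a pie sector. The summary does not describe how iSCAN-2 computes this, so everything here is an assumption: the function name, the dead-zone radius, the convention that sector 0 is centred on "up" with sectors numbered clockwise, and screen coordinates with y increasing downward. The default of 7 sectors matches the 7 phoneme groups on the front layer.

```python
import math


def sector_for_touch(dx, dy, n_sectors=7, dead_zone=20.0):
    """Map a displacement from the menu centre to a pie sector index.

    dx, dy: offset from the centre in screen units (y grows downward).
    Returns None inside the central dead zone, otherwise an index in
    0..n_sectors-1, with sector 0 centred on "up" and indices increasing
    clockwise. Only the direction matters, not the distance travelled.
    """
    if math.hypot(dx, dy) < dead_zone:
        return None  # still at the centre: no sector selected
    # Angle measured clockwise from "up", in degrees within [0, 360).
    angle = math.degrees(math.atan2(dx, -dy)) % 360.0
    width = 360.0 / n_sectors
    # Shift by half a sector so that sector 0 straddles the "up" direction.
    return int(((angle + width / 2) % 360.0) // width)
```

Because any point within a sector's angular range selects it, a movement that overshoots or wavers still lands on the intended target as long as its direction is roughly right.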

Relevance

This work addresses a critical gap at the intersection of AAC (augmentative and alternative communication) and literacy. Many nonspeaking individuals, including those with cerebral palsy, intellectual disabilities, or acquired conditions, have phonological awareness (they understand speech sounds) but limited orthographic knowledge (they cannot spell). Standard text entry systems assume literacy, creating a barrier that phoneme-based entry bypasses. For accessibility practitioners and AAC specialists, iSCAN-2 demonstrates that the same prediction techniques used to accelerate conventional typing (word prediction, language modelling, dynamic layouts) can be adapted to phoneme-based systems with similar benefits. The Jolly Phonics foundation connects the system to established literacy education practices, potentially supporting communication and literacy development at the same time. The pie menu design for limited dexterity is also noteworthy: requiring directional gestures rather than precise taps makes the system more accessible to users with the motor impairments that commonly accompany speech impairments. The combination of phoneme-level input with word-level prediction offers a practical model for AAC systems that serve users across a range of literacy levels.

Tags: AAC · text entry · phoneme · speech impairment · word prediction · literacy · assistive technology · pie menu