Development and Theoretical Evaluation of Optimized Phonemic Interfaces

Gabriel J. Cler, Cara E. Stepp · 2017 · Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '17) · doi:10.1145/3132525.3132537

Summary

This paper presents the development and computational evaluation of optimized phonemic communication interfaces for augmentative and alternative communication (AAC) users. Unlike traditional letter-based (orthographic) interfaces such as QWERTY keyboards, phonemic interfaces let users compose messages by selecting sounds (phonemes) rather than letters. This approach offers two key advantages for users with motor impairments: American English words contain 14-20% fewer phonemes than letters, which reduces the number of selections needed, and phoneme selection enables text-to-speech output of any word or utterance without restriction to a predefined vocabulary. The researchers used the Metropolis algorithm (a Markov chain Monte Carlo optimization technique) to arrange 39 phonemes on hexagonal target grids, optimizing layouts based on phoneme-to-phoneme transition likelihoods from five different corpora: an actual AAC user's vocabulary, suggested AAC conversational phrases compiled by specialists, simulated AAC messages from Mechanical Turk, and two versions of the Buckeye Corpus of conversational speech (dictionary-based and direct phonetic transcriptions). Efficiency was calculated using Fitts' law, which models movement time based on target distance and size. Seven interfaces were evaluated: five corpus-optimized layouts, an alphabetically arranged layout, and a previously developed articulatory interface that groups phonemes by manner and place of articulation.
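
To make the optimization loop concrete, the following is a minimal sketch of Metropolis-style layout search scored with Fitts' law. It is not the authors' implementation: the Shannon formulation of Fitts' law, the constants `a` and `b`, the `temperature`, the six-symbol toy inventory, and the transition probabilities are all illustrative assumptions, and the helper names (`hex_grid`, `fitts_time`, `layout_cost`, `metropolis`) are hypothetical.

```python
import math
import random

def hex_grid(rows, cols, spacing=1.0):
    """Centers of a hexagonal grid; odd rows are shifted by half a cell."""
    return [((c + 0.5 * (r % 2)) * spacing, r * spacing * math.sqrt(3) / 2)
            for r in range(rows) for c in range(cols)]

def fitts_time(distance, width, a=0.0, b=0.1):
    """Movement time in seconds, Shannon form: a + b * log2(D / W + 1).
    a and b are placeholder constants, not values from the paper."""
    return a + b * math.log2(distance / width + 1)

def layout_cost(layout, positions, transitions, width=1.0):
    """Expected movement time per transition, weighted by transition likelihood."""
    total = 0.0
    for (src, dst), prob in transitions.items():
        d = math.dist(positions[layout[src]], positions[layout[dst]])
        total += prob * fitts_time(d, width)
    return total

def metropolis(symbols, positions, transitions, steps=5000, temperature=0.01, seed=0):
    """Swap two symbols per step; always accept improvements, and accept
    worse layouts with probability exp(-delta / temperature)."""
    rng = random.Random(seed)
    slots = list(range(len(positions)))
    rng.shuffle(slots)
    layout = dict(zip(symbols, slots))
    cost = layout_cost(layout, positions, transitions)
    best, best_cost = dict(layout), cost
    for _ in range(steps):
        s1, s2 = rng.sample(symbols, 2)
        layout[s1], layout[s2] = layout[s2], layout[s1]
        new_cost = layout_cost(layout, positions, transitions)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(layout), cost
        else:
            layout[s1], layout[s2] = layout[s2], layout[s1]  # reject: undo the swap
    return best, best_cost

# Toy inventory and made-up transition likelihoods (assumptions for illustration only).
symbols = ["AH", "T", "N", "S", "IH", "K"]
positions = hex_grid(rows=2, cols=3)
transitions = {("AH", "N"): 0.30, ("N", "T"): 0.20, ("S", "T"): 0.20,
               ("IH", "T"): 0.15, ("K", "AH"): 0.15}

best, best_cost = metropolis(symbols, positions, transitions)
print(best, round(best_cost, 4))
```

In this sketch, frequent transitions end up on neighboring cells because each accepted swap lowers the likelihood-weighted movement time. A full-scale version would presumably replace the toy inventory with the 39 phonemes and corpus-derived transition likelihoods described above.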

Key findings

Phoneme-to-phoneme transition likelihoods were highly correlated across all five corpora (Pearson's r = 0.70-0.86), with text-based AAC corpora showing the highest within-group correlations (r = 0.85-0.86). Optimized interfaces achieved efficiencies of 36.5-39.6 words per minute (WPM), representing a 20-30% improvement over random phoneme arrangements (~30 WPM) and a 19-31% improvement over the non-optimized Alphabetic and Articulatory interfaces (both ~30 WPM). When optimizing with one corpus and testing against another, efficiency varied by only 3-5%, suggesting that the specific corpus used for optimization matters relatively little. The best absolute efficiency (39.6 WPM) was achieved when optimizing and testing on the same corpus. Time estimates for producing a standard set of 1,004 AAC messages showed the optimized phonemic interface requiring 54 hours with a stylus vs. 111 hours with a QWERTY keyboard, roughly a 50% reduction. When recalculated with Fitts' constants from a spinal cord injury user employing EMG control, both estimates roughly doubled, to 102 hours (phonemic) vs. 210 hours (QWERTY), while the relative saving stayed about the same. The Alphabetic and Articulatory interfaces, while less efficient computationally, offer potential learnability advantages: the Articulatory layout groups similar sounds together, which may also provide error tolerance, since accidentally selecting a neighboring phoneme produces a similar sound.
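
As a quick check on the time estimates above, the relative saving can be recomputed from the reported hours. This is only arithmetic on the figures quoted in this review; the dictionary layout and the `reduction` helper are illustrative assumptions.

```python
# Hours to produce the 1,004-message set, as reported in the review.
hours = {
    "stylus": {"phonemic": 54, "qwerty": 111},
    "emg": {"phonemic": 102, "qwerty": 210},
}

def reduction(baseline, improved):
    """Fractional time saved relative to the baseline."""
    return (baseline - improved) / baseline

for method, h in hours.items():
    saved = reduction(h["qwerty"], h["phonemic"])
    print(f"{method}: {saved:.1%} less time than QWERTY")
```

Both input methods give a saving of just over half, which is consistent with the "roughly 50%" figure even though the absolute hours differ by a factor of about two.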

Relevance

This research has practical implications for AAC device designers and clinicians working with people who have severe motor impairments. The finding that corpus-optimized phonemic interfaces can cut communication time by roughly 50% compared to QWERTY keyboards is significant for users who may spend many minutes composing a single message. The robustness of optimization across different corpora is reassuring: even without a corpus of the specific user's communication, optimizing with any relevant phonemic data still yields substantial efficiency gains. For accessibility practitioners, the paper also demonstrates that interface optimization techniques well established for orthographic keyboards can be adapted for phonemic AAC, and that the choice between phonemic and orthographic interfaces involves trade-offs between efficiency, learnability, and flexibility. The main limitation is that the evaluation is theoretical (computational) rather than empirical: the actual performance of users with motor impairments, their learning curves, and their error rates remain to be studied. The paper also does not incorporate predictive text, which could further increase communication rates.

Tags: augmentative and alternative communication · motor disability · input methods · interface design · speech synthesis · communication accessibility · optimization