A Longitudinal Evaluation of Tablet-Based Child Speech Therapy with Apraxia World
Adam Hair, Kirrie J. Ballard, Constantina Markoulli, Penelope Monroe, Jacqueline McKechnie, Beena Ahmed, Ricardo Gutierrez-Osuna · 2021 · ACM Transactions on Accessible Computing · doi:10.1145/3433607
Summary
This paper presents Apraxia World, a tablet-based speech therapy game designed for long-term home practice by children with speech sound disorders (SSDs), particularly childhood apraxia of speech (CAS). Unlike many therapy games that use simple arcade mechanics and quickly become tedious, Apraxia World is a full 2D platformer with 40 levels across five worlds, seven characters, an in-game store, and extensive personalization options. Speech exercises are integrated as a secondary input—players collect stars scattered throughout levels, which trigger pronunciation exercises. The game uses template matching (TM) for automatic pronunciation evaluation, comparing new utterances against previously collected correct and incorrect samples from each child. This approach requires minimal data (a few recordings per word) and runs locally on the tablet without internet connectivity. A companion app allows speech-language pathologists (SLPs) to select target words based on each child's specific needs and collect calibration recordings.
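The template-matching idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: utterances are represented as lists of per-frame feature vectors, sequences are compared with dynamic time warping (DTW) under a Euclidean frame distance, and the decision rule is a plain nearest-template comparison. The function names (`dtw_distance`, `evaluate_utterance`) are hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW cost between two sequences of feature frames."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def evaluate_utterance(utterance, correct_templates, incorrect_templates):
    """Label an utterance by its nearest template (assumed decision rule).

    Both template lists must be non-empty; min() raises ValueError otherwise.
    """
    best_correct = min(dtw_distance(utterance, t) for t in correct_templates)
    best_incorrect = min(dtw_distance(utterance, t) for t in incorrect_templates)
    return "correct" if best_correct <= best_incorrect else "incorrect"


# Toy one-dimensional "features" standing in for real acoustic frames.
correct_templates = [[(0.0,), (1.0,), (2.0,)]]
incorrect_templates = [[(5.0,), (5.0,), (5.0,)]]
slow_but_correct = [(0.0,), (0.0,), (1.0,), (2.0,)]
print(evaluate_utterance(slow_but_correct, correct_templates, incorrect_templates))
```

Because DTW lets one frame align with several, the slower toy utterance still matches the correct template exactly. A per-child, per-word template set like this needs only a handful of stored recordings, which is consistent with the small calibration requirement described above.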
Key findings
A longitudinal study with 10 children (ages 5 to 12) with SSDs showed therapeutically significant speech improvements over two counterbalanced 4-week treatment phases. Children averaged 56.6% absolute improvement with automated feedback and 61.5% with caregiver feedback, comparable to improvements reported for traditional clinician-based therapy of similar intensity. Children remained engaged throughout the multi-month study, spending an average of 19.5 minutes per day playing and completing approximately 76 speech exercises daily. Nine of 10 children continued playing after the study ended, and all caregivers reported that their children were engaged. Game personalization was highly popular: all children purchased clothing items (averaging 26 purchases each), and the in-game store was frequently cited as a favorite feature. The template matching algorithm achieved an F1 score of 72% for mispronunciation detection, outperforming both the standard Goodness of Pronunciation (GOP) method (69% F1) and caregiver evaluations (41% F1). Notably, caregivers were lenient evaluators, with only 27% recall: they missed many mispronunciations.
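To make the relationship between the reported F1 and recall figures concrete, the short sketch below applies the standard definition F1 = 2PR / (P + R) and its algebraic inversion. The helper names are hypothetical, and the precision value it produces is derived here from the rounded figures in this summary; it is not a number reported by the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_from_f1(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision exists for this F1/recall pair")
    return f1 * recall / denom


# Caregiver figures as quoted in this summary (rounded percentages).
implied = precision_from_f1(0.41, 0.27)
print(round(implied, 2))  # → 0.85
```

Under these rounded inputs the implied caregiver precision is roughly 0.85, which fits the description of lenient evaluators: they rarely flagged a word, but when they did they were usually right. The estimate is sensitive to rounding in the inputs, so it should be read as an illustration of the metric rather than a result.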
Relevance
This research demonstrates that game-based speech therapy can maintain child engagement over extended treatment periods while producing clinically meaningful outcomes. For accessibility practitioners, the study offers several design insights: limiting daily play (one level per day) built anticipation and prevented rapid completion; integrating exercises as collectible items rather than primary controls avoided frustrating children who struggle with target sounds; and extensive personalization options (characters, costumes, weapons) kept children invested. The automatic pronunciation evaluation approach is notable for requiring minimal setup (just a few calibration recordings per target word), which makes it practical for clinical deployment. However, audio quality issues (54% of recordings had problems) highlight the need for better recording mechanisms in child-facing applications. The study also raises equity concerns: males are 2.85 times more likely to have SSDs, and all but one participant was male, so future studies should deliberately recruit female participants. Future work should explore transparency in automated feedback and the potential for therapy games to normalize speech practice among peers.
Tags: speech therapy · childhood apraxia of speech · speech sound disorders · serious games · games for health · automatic speech recognition · mispronunciation detection · children · mobile health