
Architecture of an Automated Therapy Tool for Childhood Apraxia of Speech

Avinash Parnandi, Virendra Karappa, Youngpyo Son, Mostafa Shahin, Jacqueline McKechnie, Kirrie Ballard, Beena Ahmed, Ricardo Gutierrez-Osuna · 2013 · Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS) · doi:10.1145/2513383.2513450

Summary

This paper presents a multi-tier client-server system for remotely administering speech therapy to children with childhood apraxia of speech (CAS), a neurological speech sound disorder that impairs the precision and consistency of the oro-motor planning and execution needed for speech production. CAS affects an estimated 5-6% of children, and the ratio of children needing therapy to qualified clinicians is rising, widening the gap between the intervention children need and what is available. The system implements the Nuffield Dyspraxia Programme (NDP3), a bottom-up intervention protocol for children aged 3-7 with severe speech sound disorders, which builds skills from single sounds through syllables, words, and sentences. The architecture has three components: a tablet-based mobile app where children practice exercises using picture-based stimuli, a Moodle-based server that manages therapy courses and hosts a speech analysis engine, and a web-based clinician interface for assigning exercises, reviewing recordings, and monitoring progress. The speech analysis engine combines three parts: a Hidden Markov Model (HMM) decoder trained on 40 hours of child speech to identify mispronunciations (insertions, deletions, substitutions), a voice activity detector to identify articulatory struggle, and a lexical stress classifier (88.7% accuracy) to detect prosodic errors. The system was evaluated in a pilot study with four children (ages 3-7) with clinically diagnosed CAS, their parents, and four speech therapists.
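To make the three mispronunciation types concrete, the following sketch labels differences between a target phoneme sequence and a decoded one. The summary does not say how the engine compares decoder output against the target, so the edit-distance (Levenshtein) alignment used here is an assumption, and the function name `classify_errors` and the phoneme labels are hypothetical.

```python
from typing import List, Tuple


def classify_errors(expected: List[str], decoded: List[str]) -> List[Tuple[str, str, str]]:
    """Align two phoneme sequences and label each difference.

    Returns (error_type, expected_phoneme, decoded_phoneme) tuples in
    left-to-right order. This is an illustrative edit-distance alignment,
    not the paper's documented method.
    """
    n, m = len(expected), len(decoded)
    # cost[i][j] = minimum number of edits turning expected[:i] into decoded[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = int(expected[i - 1] != decoded[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion (child dropped a phoneme)
                cost[i][j - 1] + 1,             # insertion (child added a phoneme)
            )

    # Walk back from the end of both sequences to recover the edits.
    errors: List[Tuple[str, str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = int(expected[i - 1] != decoded[j - 1])
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                if mismatch:
                    errors.append(("substitution", expected[i - 1], decoded[j - 1]))
                i -= 1
                j -= 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors.append(("deletion", expected[i - 1], ""))
            i -= 1
        else:
            errors.append(("insertion", "", decoded[j - 1]))
            j -= 1
    errors.reverse()
    return errors


if __name__ == "__main__":
    # Hypothetical target and decoded phoneme strings for a single word.
    print(classify_errors(["s", "p", "uw", "n"], ["p", "uw", "n"]))
```

An alignment of this kind produces a per-phoneme error list, which is the sort of detail a clinician interface could show next to each recording.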

Key findings

The pilot study found that the tablet-based therapy system was well received by all stakeholder groups. Children were engaged during sessions, and most could work independently after a brief demonstration; the youngest child (age 3) required continued assistance. Children particularly enjoyed recording and playing back their own voices. Therapists valued the ability to remotely assign exercises, listen to recordings, and monitor progress. One commented that most speech pathology apps "are not any good" but that this system compared favorably. All four children said they would like to do the exercises again, and one asked to take the tablet home. Parents indicated they would use the system daily if it were available at home. Key areas for improvement included adding gamification elements (rewards, animations, badges, background music), incorporating real-time speech error feedback, adding audio prompts for each image to model correct pronunciation, and supporting an offline mode for sessions without internet connectivity. Therapists emphasized the need for real-time alerts when children produce errors, to prevent repeated incorrect practice.

Relevance

This research addresses a critical accessibility gap: children with speech sound disorders need intensive, frequent therapy that is often unavailable due to clinician shortages, geographic barriers, and cost. The system demonstrates how mobile technology combined with automated speech analysis can extend therapeutic intervention into the home, complementing rather than replacing face-to-face sessions. For accessibility practitioners, the work highlights important design considerations for children's therapeutic applications: providing large touch targets to accommodate lower motor dexterity, preventing accidental multiple recordings caused by holdovers (repeated button presses), and using gamification to sustain engagement with young users. The automated speech analysis pipeline, which combines voice activity detection, HMM-based phoneme recognition, and lexical stress classification, represents a practical approach to providing feedback without requiring a clinician to be present. The pilot feedback makes clear, however, that real-time feedback is essential to keep children from practicing errors unsupervised.
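As a rough illustration of the voice activity detection stage, the sketch below finds bursts of speech in a recording by thresholding short-time energy. The summary does not describe the detector the authors used, so the energy-based method, the function name `detect_speech_segments`, and the parameter values are all assumptions. Treating several separate bursts within a single-word prompt as a possible sign of restarts is likewise a hypothetical heuristic, not the paper's stated rule.

```python
from typing import List, Sequence, Tuple


def detect_speech_segments(
    samples: Sequence[float],
    frame_len: int = 160,      # assumed: 10 ms frames at a 16 kHz sampling rate
    threshold: float = 0.01,   # assumed mean-energy threshold for "speech"
    min_frames: int = 3,       # ignore bursts shorter than this many frames
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) spans whose mean energy exceeds threshold.

    end_frame is exclusive. Any trailing partial frame is ignored.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            if start is None:
                start = f  # a new burst begins
        elif start is not None:
            if f - start >= min_frames:
                segments.append((start, f))
            start = None
    # Close a burst that runs to the end of the recording.
    if start is not None and n_frames - start >= min_frames:
        segments.append((start, n_frames))
    return segments


if __name__ == "__main__":
    # Synthetic signal: silence, burst, silence, burst, silence (5 frames each).
    silence = [0.0] * 800
    burst = [0.5] * 800
    signal = silence + burst + silence + burst + silence
    segments = detect_speech_segments(signal)
    # Hypothetical heuristic: more than one burst for a one-word prompt
    # might indicate a restart and be worth flagging for the clinician.
    print(segments, "possible restart" if len(segments) > 1 else "single attempt")
```

A production detector would need to cope with background noise and varying microphone levels, which a fixed threshold does not.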

Tags: childhood apraxia of speech · speech therapy · speech sound disorder · automated speech analysis · tele-rehabilitation · mobile applications · speech recognition