Perspectives on Speech and Language Interaction for Daily Assistive Technology: Introduction to Part 1 of the Special Issue

Heidi Christensen, Frank Rudzicz, François Portet, Jan Alexandersson · 2015 · ACM Transactions on Accessible Computing (TACCESS) · doi:10.1145/2756765

Summary

This editorial introduces the first part of a TACCESS special issue on speech and language interaction for daily assistive technology, which grew out of the 2013 SLPAT (Speech and Language Processing for Assistive Technologies) workshop. The editors frame speech and natural language as critical interaction modalities for people with communication disorders and note several application areas: improving the intelligibility of pathological speech, providing specialized speech recognition for conditions such as cerebral palsy or Parkinson's disease, and supporting clinical assessment through automated tools. The special issue spans engineering to clinical sciences, reflecting the interdisciplinary nature of assistive speech technology research. This first installment focuses on automatic assessment of disordered speech: measuring speech capability and intelligibility computationally rather than through time-consuming manual transcription by human listeners.

Key findings

The introduction previews three articles addressing automatic speech assessment.

First, Pellegrini et al. investigate "goodness of pronunciation" (GOP) scores derived from ASR to detect pronunciation discrepancies in disordered speech. By comparing phone sequences from unrestricted ASR against forced alignment, their method correlates GOP scores with speaker impairment levels and speech comprehensibility. Notably, 70.2% of mispronunciations could be automatically detected, a result that is valuable for clinical settings and for educational applications such as reading tutors.

Second, Laaridh et al. address automatic detection of phone-based anomalies in dysarthric speech using support vector machines (SVMs). Their SVM approach outperforms baseline systems and produces automatically estimated measures (intelligibility, articulation impairment) that correlate with expert ratings from 11 judges.

Third, Martínez et al. apply iVectors, a technique from speaker recognition, to automatically estimate speech intelligibility. The method shows high correlation between automatic and manual intelligibility assessments and can predict ASR performance on disordered speech, which is critical for prescribing appropriate assistive technologies.
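To make the GOP idea concrete, the following is a minimal sketch rather than the method of Pellegrini et al. It assumes the commonly used formulation in which a phone's GOP score is the absolute difference between the log-likelihood of the segment under the expected phone (from forced alignment) and the best log-likelihood from an unconstrained phone recognizer, divided by the segment length in frames. This summary does not state which variant the article uses. The names `PhoneSegment`, `gop_score`, and `flag_mispronunciations`, the threshold, and all numeric values are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PhoneSegment:
    """One phone segment with acoustic scores (hypothetical container)."""
    phone: str            # expected phone label from forced alignment
    forced_loglik: float  # log-likelihood of the segment given the expected phone
    free_loglik: float    # best log-likelihood from an unconstrained phone loop
    n_frames: int         # segment duration in frames


def gop_score(seg: PhoneSegment) -> float:
    """Duration-normalised GOP: |forced - free| / n_frames.

    A score near 0 means the expected phone explains the audio about as
    well as the recognizer's own best guess; larger values suggest a
    pronunciation discrepancy.
    """
    if seg.n_frames <= 0:
        raise ValueError("segment must span at least one frame")
    return abs(seg.forced_loglik - seg.free_loglik) / seg.n_frames


def flag_mispronunciations(segments: List[PhoneSegment], threshold: float) -> List[str]:
    """Return the expected phones whose GOP score exceeds the threshold."""
    return [s.phone for s in segments if gop_score(s) > threshold]


# Illustrative values only; real scores would come from an ASR toolkit.
example = [
    PhoneSegment("k", forced_loglik=-120.0, free_loglik=-118.0, n_frames=10),
    PhoneSegment("ae", forced_loglik=-300.0, free_loglik=-240.0, n_frames=12),
]
print(flag_mispronunciations(example, threshold=1.0))  # prints ['ae']
```

In practice the threshold would need to be tuned, possibly per phone, against human annotations, and per-speaker averages of such scores are the kind of quantity that could then be compared with impairment or comprehensibility ratings.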

Relevance

For accessibility practitioners, this editorial highlights an important but often overlooked area: speech-based interfaces remain inaccessible to many users with motor or cognitive impairments that affect speech production. Standard ASR systems trained on typical speech perform poorly on dysarthric or disordered speech, creating a significant accessibility gap. The research directions outlined here (automatic intelligibility assessment, pronunciation error detection, and ASR performance prediction) have practical implications. Clinicians need efficient assessment tools; AT practitioners need to predict whether voice interfaces will work for specific users; and developers of speech-based assistive technologies need training data and evaluation methods for atypical speech. The emergence of SLPAT as a dedicated research community bridging computational linguistics and assistive technology signals growing recognition that speech accessibility requires specialized approaches beyond mainstream ASR. For organizations deploying voice interfaces, understanding the limitations for users with speech differences is essential for inclusive design.

Tags: speech recognition · disordered speech · dysarthria · speech intelligibility · assistive technology · automatic speech recognition · clinical assessment · speech pathology · natural language processing