
Automatic Detection of Phone-Based Anomalies in Dysarthric Speech

Imed Laaridh, Corinne Fredouille, Christine Meunier · 2015 · ACM Transactions on Accessible Computing (TACCESS) · doi:10.1145/2739050

Summary

This research develops automatic methods to detect and localize acoustic anomalies in speech produced by people with dysarthria, a motor speech disorder caused by neurological damage affecting the respiratory, phonatory, resonatory, articulatory, or prosodic components of speech. The goal is to assist clinicians by automatically flagging atypical speech segments for expert review, reducing the time-intensive manual analysis currently required for diagnosis and therapy monitoring.

Two detection approaches are evaluated. The baseline system models only normal speech, using Hidden Markov Models (HMMs) trained on 200 hours of French broadcast speech, and flags phones that deviate from the expected acoustic patterns. The novel SVM-based approach models both normal and abnormal speech, using features derived from phone alignment outputs to classify each phone as normal or abnormal. The output is a visual "normality map" in which each phone is color-coded (blue = normal, red = abnormal) so that clinicians can quickly spot problem areas.

The evaluation uses two French corpora. Corpus 1 contains 8 speakers with lysosomal storage disease (a rare condition causing mixed dysarthria), recorded longitudinally over 2 years, plus 6 controls. Corpus 2 is larger, with 118 speakers: 89 dysarthric speakers across three conditions (ALS, 37; Parkinson's disease, 31; cerebellar ataxia, 21) plus 29 controls. All speakers read the same French fairytale text.
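As an illustration of the per-phone output described above, here is a minimal Python sketch that renders a text-only normality map and computes the share of phones flagged as abnormal. The function names, the bracket notation (standing in for the red/blue coloring), and the example phone labels are all hypothetical; the paper's own rendering and label set are not reproduced here.

```python
def normality_map(phones, is_abnormal):
    """Render phones as text, wrapping abnormal ones in brackets.

    phones:      list of phone labels (strings), in utterance order
    is_abnormal: list of booleans of the same length (True = abnormal)
    Brackets are a hypothetical stand-in for the red/blue color coding.
    """
    if len(phones) != len(is_abnormal):
        raise ValueError("phones and is_abnormal must have the same length")
    return " ".join(
        f"[{phone}]" if bad else phone
        for phone, bad in zip(phones, is_abnormal)
    )


def anomaly_rate(is_abnormal):
    """Fraction of phones flagged abnormal (0.0 for an empty sequence)."""
    if not is_abnormal:
        return 0.0
    return sum(is_abnormal) / len(is_abnormal)


# Toy example with made-up labels and classifier decisions.
phones = ["b", "o~", "Z", "u", "R"]
flags = [False, True, False, False, True]
print(normality_map(phones, flags))  # b [o~] Z u [R]
print(anomaly_rate(flags))           # 0.4
```

A per-speaker anomaly rate of this kind is the sort of summary figure that can then be compared against expert severity ratings.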

Key findings

The SVM-based classification system outperformed the baseline. For detecting abnormal phones, the SVM approach achieved 0.81 recall (detecting 81% of the anomalies annotated by experts) compared to 0.74 for the baseline, under the "one-phone delay" evaluation strategy that accounts for boundary alignment differences. Precision remained moderate at 0.63, indicating that the system tends to flag more anomalies than human experts do, which is potentially acceptable for a screening tool.

Phone alignment quality depended strongly on dysarthria severity: automatic alignment achieved 96% agreement with expert segmentation for control speakers but degraded to 70-74% for severely dysarthric speakers.

Critically, the automatic anomaly detection rate showed high correlation (0.86-0.91) with expert perceptual ratings of dysarthria severity, articulation impairment, and intelligibility in Corpus 2. This suggests the system captures clinically meaningful speech differences. Anomaly detection itself worked better on speakers with more severe dysarthria. This may seem counterintuitive given the weaker alignment, but it is likely because their anomalies are more acoustically distinct. Performance was consistent across diseases (Parkinson's disease, ALS, lysosomal storage disease) except cerebellar ataxia, which showed weaker correlations.
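The "one-phone delay" scoring mentioned above can be illustrated with a small Python sketch. It assumes one plausible reading of the strategy: a flagged phone counts as correct if the expert marked an anomaly on that phone or on an immediate neighbor, and an expert-marked phone counts as detected under the same rule. The paper's exact matching rule may differ, and the function name and toy labels below are hypothetical.

```python
def tolerant_precision_recall(reference, predicted, delay=1):
    """Precision and recall for abnormal phones with a position tolerance.

    reference: list of booleans, expert labels per phone (True = abnormal)
    predicted: list of booleans, system labels per phone (True = abnormal)
    delay:     number of neighboring phones on each side that still count
               as a match (delay=0 gives strict phone-by-phone scoring)
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    n = len(reference)

    def has_flag_nearby(flags, i):
        # True if any phone within `delay` positions of i is flagged.
        lo = max(0, i - delay)
        hi = min(n, i + delay + 1)
        return any(flags[lo:hi])

    # Expert anomalies that the system flagged on or near the same phone.
    detected = sum(
        1 for i in range(n) if reference[i] and has_flag_nearby(predicted, i)
    )
    # System flags that fall on or near an expert anomaly.
    correct = sum(
        1 for i in range(n) if predicted[i] and has_flag_nearby(reference, i)
    )

    n_ref = sum(reference)
    n_pred = sum(predicted)
    recall = detected / n_ref if n_ref else 0.0
    precision = correct / n_pred if n_pred else 0.0
    return precision, recall


# Toy example: the system flags one anomaly a phone late and raises one
# false alarm, while missing the second expert-marked anomaly.
reference = [False, True, False, False, True, False, False, False]
predicted = [False, False, True, False, False, False, False, True]
print(tolerant_precision_recall(reference, predicted, delay=0))  # (0.0, 0.0)
print(tolerant_precision_recall(reference, predicted, delay=1))  # (0.5, 0.5)
```

With strict scoring the late flag is counted as both a miss and a false alarm; allowing a one-phone tolerance credits it, which is why boundary disagreements between automatic and expert segmentation matter for the reported figures.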

Relevance

This work addresses a practical clinical bottleneck: perceptual evaluation of dysarthric speech is time-consuming, subjective, and requires specialized expertise. Automatic anomaly detection could enable more frequent monitoring, earlier detection of progression, and more objective measurement of therapy outcomes. The visual normality maps give clinicians an intuitive interface for reviewing flagged segments rather than listening to entire recordings.

For AAC and speech recognition developers, this research has direct implications. Understanding which phones are most affected helps in adapting ASR systems for dysarthric speakers, a known challenge since most speech recognition systems are trained on typical speech. The finding that the system generalizes across disease types (with cerebellar ataxia as an exception) suggests potential for broad clinical applicability. The moderate precision (many false positives) indicates the system works best as a screening tool that guides expert attention rather than as a standalone diagnostic.

Tags: dysarthria · speech recognition · automatic speech processing · motor speech disorders · clinical assessment · machine learning · phonetics · assistive technology

Standards referenced: Mayo Clinic classification of dysarthrias · Frenchay Dysarthria Assessment Test