Detecting Readers with Dyslexia Using Machine Learning with Eye Tracking Measures
Luz Rello, Miguel Ballesteros · 2015 · Proceedings of the 12th International Web for All Conference (W4A) · doi:10.1145/2745555.2746644
Summary
This paper presents the first machine learning model to automatically detect readers with dyslexia from eye tracking data. The work builds on decades of psychology and HCI research showing that people with dyslexia exhibit distinct eye movement patterns: longer fixations, more fixations, shorter saccades, and more regressions than typical readers. Prior work, however, had only identified statistical differences between groups and had not applied machine learning to predict dyslexia from these measures. The authors trained a Support Vector Machine (SVM) binary classifier on a dataset of 1,135 readings from 97 Spanish-speaking participants aged 11 to 54, of whom 48 had a confirmed dyslexia diagnosis and 49 did not. Each participant read 12 texts presented in different typefaces while a Tobii 1750 eye tracker recorded their eye movements. The texts were controlled for comparability: each was 60 words long, all came from the same book and had similar word lengths, and all were presented in standardized layouts following British Dyslexia Association guidelines. The study also examined how individual features contributed to classification accuracy, testing reading time, fixation duration, number of fixations, typeface properties, and participant age.
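To make the feature set concrete, the following minimal sketch shows how one reading could be summarised into the measures named above. The `Fixation` record, the `reading_features` helper, and the definition of reading time (first fixation onset to last fixation end) are illustrative assumptions, not the authors' data format or code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Fixation:
    """Hypothetical record for one fixation; not the eye tracker's export format."""
    start_ms: float     # onset time relative to the start of the reading
    duration_ms: float  # how long the gaze rested on one point


def reading_features(fixations, age):
    """Summarise one reading as a flat dictionary of features."""
    if not fixations:
        raise ValueError("a reading needs at least one fixation")
    first_onset = min(f.start_ms for f in fixations)
    last_end = max(f.start_ms + f.duration_ms for f in fixations)
    return {
        # Assumption: reading time spans first fixation onset to last fixation end.
        "reading_time_ms": last_end - first_onset,
        "n_fixations": len(fixations),
        "mean_fixation_ms": mean(f.duration_ms for f in fixations),
        "age": age,
    }


# Made-up values for illustration only; they are not data from the study.
example = reading_features(
    [Fixation(0, 220), Fixation(260, 180), Fixation(480, 200)],
    age=23,
)
```

One such dictionary per reading, together with a dyslexia/no-dyslexia label, is the kind of input a binary classifier such as an SVM would be trained on.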
Key findings
The SVM model achieved 80.18% accuracy in a 10-fold cross-validation experiment, correctly classifying 910 of 1,135 readings. The three most informative features were reading time, mean fixation duration, and participant age. Typeface-related features (serif vs. sans-serif, italic vs. roman, dyslexia-specific fonts) had no impact on classification accuracy: fonts that improve readability for people with dyslexia also benefit readers without dyslexia, so font choice does not help distinguish between the groups. When age was removed as a feature, accuracy dropped to 76.39%, indicating that age-related differences in reading performance contribute to the model's predictions. Accuracy on individual folds ranged from 61.21% to 96.26%, suggesting that some readings are inherently harder to classify, for instance those of older adults with dyslexia who have developed compensatory reading strategies. The overall accuracy is comparable to that of neuroimaging-based prediction methods (81% from neonatal brain responses), but eye tracking is far less invasive and more affordable.
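The headline figure follows directly from the counts reported above, and the evaluation scheme can be sketched in a few lines. The contiguous, near-equal fold split below is an assumption made for illustration; how the authors assigned readings to folds is not described here, and in practice the readings would normally be shuffled before splitting.

```python
def k_fold_indices(n_items, k=10):
    """Split range(n_items) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_items, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more item
        folds.append(range(start, start + size))
        start += size
    return folds


def accuracy_percent(n_correct, n_total):
    """Share of correctly classified readings, as a percentage."""
    return 100.0 * n_correct / n_total


# Each fold serves once as the test set while the other nine are used for training.
folds = k_fold_indices(1135, k=10)

# Counts reported in the paper: 910 of 1,135 readings classified correctly.
overall = accuracy_percent(910, 1135)
print(f"{overall:.2f}%")
```

Because every reading is tested exactly once, the overall accuracy pools the results of all ten folds, which is why it can sit well inside the wide range of per-fold values.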
Relevance
This research opens an avenue for accessible, scalable dyslexia screening. Dyslexia, often called a "hidden disability", affects 10 to 17.5% of the population but is significantly underdiagnosed because traditional diagnosis requires expensive expert assessment. Eye tracking offers a less intrusive alternative: reading a text silently is far simpler than completing a standard diagnostic battery. As eye trackers become cheaper and more widely available (including webcam-based solutions), this approach could enable early screening in schools and clinics. For web accessibility practitioners, the findings reinforce the importance of text presentation choices: although typeface does not help predict dyslexia, reading time and fixation patterns clearly differ between groups, underscoring the need for readable, well-structured web content. The work also demonstrates how interaction data of the kind collected in HCI research can serve diagnostic purposes, pointing toward future tools that could adapt content in real time based on detected reading difficulties.
Tags: dyslexia · eye tracking · machine learning · detection · support vector machine · reading accessibility · learning disabilities