Detecting Linguistic HCI Markers in an Online Aphasia Support Group
Yoram M. Kalman, Kathleen Geraghty, Cynthia K. Thompson, Darren Gergle · 2012 · Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2012) · doi:10.1145/2384916.2384928
Summary
This paper investigates whether the language deficits associated with aphasia, an acquired language disorder typically resulting from stroke or brain injury, can be detected in online written communication. The authors introduce the concept of "HCI markers": measurable signals generated during human-computer interaction that may reveal information about a user's cognitive, psychological, or physiological state, analogous to how biomarkers such as blood sugar levels indicate disease processes. The researchers analyzed 150 messages (14,754 words) posted to a public online aphasia support forum by six people with aphasia and four control participants (relatives or a speech-language pathologist spouse). Messages were segmented into utterances, coded for parts of speech using the Penn Treebank framework, and analyzed for grammatical errors, morpheme inflection errors, and errors in open class words (nouns, verbs, adjectives, adverbs) and closed class words (prepositions, articles, conjunctions). Seven linguistic variables were compared between the two groups using conservative nonparametric Wilcoxon rank-sum tests, the unpaired form of the test appropriate for independent groups of unequal size.
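The measurement step described above can be illustrated with a minimal sketch. It assumes the utterances have already been segmented and tagged with Penn Treebank part-of-speech tags, and it computes three of the study's variables: mean length of utterance, open/closed class word ratio, and noun/verb ratio. The function name `linguistic_measures`, the tag groupings, and the example sentences are illustrative assumptions, not the authors' coding scheme. In particular, the closed class set covers only the categories this summary names.

```python
# Penn Treebank tags grouped by the word classes named in the summary.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}
ADVERB_TAGS = {"RB", "RBR", "RBS"}
OPEN_CLASS_TAGS = NOUN_TAGS | VERB_TAGS | ADJECTIVE_TAGS | ADVERB_TAGS

# Assumption: closed class limited to prepositions (IN), determiners
# including articles (DT), and coordinating conjunctions (CC).
CLOSED_CLASS_TAGS = {"IN", "DT", "CC"}


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def linguistic_measures(utterances):
    """Compute three measures from utterances given as lists of (word, tag) pairs."""
    # Keep word tokens only: Penn Treebank punctuation tags do not start with a letter.
    tags = [tag for utt in utterances for _, tag in utt if tag[:1].isalpha()]
    n_open = sum(tag in OPEN_CLASS_TAGS for tag in tags)
    n_closed = sum(tag in CLOSED_CLASS_TAGS for tag in tags)
    n_nouns = sum(tag in NOUN_TAGS for tag in tags)
    n_verbs = sum(tag in VERB_TAGS for tag in tags)
    return {
        "mean_length_of_utterance": safe_ratio(len(tags), len(utterances)),
        "open_closed_ratio": safe_ratio(n_open, n_closed),
        "noun_verb_ratio": safe_ratio(n_nouns, n_verbs),
    }


# Hypothetical, hand-tagged example (not taken from the study's corpus).
example = [
    [("I", "PRP"), ("had", "VBD"), ("a", "DT"), ("stroke", "NN"), (".", ".")],
    [("Words", "NNS"), ("are", "VBP"), ("hard", "JJ"), ("for", "IN"), ("me", "PRP"), (".", ".")],
]
measures = linguistic_measures(example)
print(measures)
```

The error-based variables (ungrammatical sentences, inflection errors, open and closed class word errors) depend on human judgment of each utterance and are not covered by this sketch.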
Key findings
Five of the seven linguistic variables showed significant differences between people with aphasia and controls (p ≤ 0.01), establishing them as candidate HCI markers for aphasia. People with aphasia had significantly shorter mean length of utterance (8.72 vs 14.75 words), higher rates of ungrammatical sentences (31.17% vs 5.28%), more morpheme inflection errors (7.40% vs 1.05%), more open class word errors (4.03% vs 0.03%), and more closed class word errors (4.03% vs 0.36%). Two variables, the open/closed class word ratio and the noun/verb ratio, did not differ significantly between groups. Notably, people with aphasia produced shorter utterances and more errors even though online messages, unlike pressured spoken conversation, allow effectively unlimited time to compose. This indicates that key aphasic language characteristics persist in asynchronous computer-mediated communication. Variability within the aphasia group was also high, consistent with the heterogeneous nature of the condition.
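The reported significance level can be put in context with a small worked example. The sketch below assumes an unpaired, exact rank-based comparison (a Wilcoxon rank-sum test) computed once per participant, which this summary does not confirm, and it uses made-up utterance lengths rather than the study's data. The function names `midranks` and `exact_rank_sum_p` are hypothetical. Under those assumptions, groups of six and four that do not overlap at all yield the smallest possible two-sided p-value, 2/210.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Two-sided exact p-value for the rank-sum statistic, by full enumeration."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a = len(group_a)
    expected = n_a * (len(pooled) + 1) / 2  # rank sum of group A under the null
    observed = abs(sum(ranks[:n_a]) - expected)
    extreme = total = 0
    # Every way of assigning n_a of the pooled ranks to group A is equally likely.
    for combo in combinations(ranks, n_a):
        total += 1
        if abs(sum(combo) - expected) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical per-participant mean utterance lengths (NOT the study's data).
aphasia = [6.1, 7.4, 8.0, 8.9, 9.5, 10.2]
controls = [12.3, 13.8, 15.1, 16.0]
print(exact_rank_sum_p(aphasia, controls))  # 2/210, about 0.0095
```

Because there are only 210 ways to split ten ranks into groups of six and four, no exact two-sided result below roughly 0.0095 is possible with these group sizes. Under the stated assumptions, a result of p ≤ 0.01 would therefore correspond to complete separation of the two groups on that variable.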
Relevance
This study opens up the possibility of using naturally occurring online text as a low-cost, unobtrusive tool for monitoring language abilities in people with aphasia. For accessibility practitioners, the five identified HCI markers could inform the design of adaptive interfaces that detect language difficulties and adjust accordingly, for example by simplifying text input options or offering communication support when linguistic markers suggest a user may benefit from it. The research also has implications for longitudinal health monitoring: analyzing archived emails, social media posts, or forum messages could help clinicians track language recovery after stroke or detect progressive decline in conditions like Primary Progressive Aphasia. The study is limited by its small sample size and reliance on self-reported diagnoses, but it establishes a proof of concept for using digital communication patterns as health-related signals, a concept with broad applicability beyond aphasia to other cognitive and neurological conditions.
Tags: aphasia · computer-mediated communication · linguistic analysis · online support groups · user modeling · health monitoring · stroke