Investigating Best Practices for Remote Summative Usability Testing with People with Mild to Moderate Dementia
Rachel Wood, Emma Dixon, Salma Elsayed-Ali, Ekta Shokeen, Amanda Lazar, Jonathan Lazar · 2021 · ACM Transactions on Accessible Computing · doi:10.1145/3460942
Summary
This study is the first to examine best practices for conducting remote summative usability testing with people who have mild to moderate dementia, independent of their caregivers. It involved 15 participants (5 pilot, 10 main study) with an average age of 64, recruited from online dementia communities. Participants tested Morphic, an auto-personalization accessibility tool for Windows, and compared it against the built-in Windows 10 accessibility features. The COVID-19 pandemic forced the study to pivot from in-person sessions to fully remote sessions over Zoom. That constraint led to two novel testing approaches: the Remote Access method (the researcher shares their screen and the participant takes remote control of the researcher's computer, which has the software pre-installed) and the Modified Think-Aloud method (the researcher shares their screen while the participant verbally dictates each action). Both emerged as on-the-fly adaptations when a participant's computer did not meet the system requirements or when installing software caused anxiety. The study analyzed 18 hours and 44 minutes of video recordings using thematic analysis and organized the findings into three categories: planning logistics, conducting the testing, and evaluating results. A scoping review of 65 papers on usability testing with people with dementia found that only 9 of 25 summative studies involved participants independent of caregivers and that none examined remote methods, a significant gap that this research addresses.
Key findings
For planning, researchers must account for participants' varied computing environments, OS versions, and device preferences (some preferred phones to computers). Task descriptions needed a background scenario for each task, clear and unambiguous language, font sizes larger than 14pt with line breaks, and the option to have tasks read aloud. Participants struggled to tell the application displaying the tasks apart from the system being evaluated, which suggests that the first task should be a practice task.
For conducting remote testing, three principles emerged. The first is supporting participant agency: always asking permission before making system changes, letting participants decide whether to keep those changes, and reaffirming their control over continuing or stopping. The second is accommodating mental fatigue: mandatory breaks for sessions over 30 minutes, with participants choosing the break length. The third is reducing anxiety: allowing participants to choose whether the researcher is visible, reframing "testing" language, encouraging experimentation, and narrowing task scope when participants became frustrated. The Modified Think-Aloud method unexpectedly created a collaborative "co-discovery" atmosphere that participants found engaging rather than stressful.
For evaluation, the standard metrics of task completion and time on task were found to be inappropriate for this population. Participants' conversational nature, the breaks needed for mental fatigue, and variability in short-term memory all distorted time measurements. Judging task completion was complicated by differing task interpretations and by "tunnel vision" scanning behaviors. The System Usability Scale worked with modifications: digital administration via Qualtrics was easier than paper, though some participants wanted to comment after each question and struggled with the abstract term "system".
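For readers unfamiliar with the System Usability Scale, the following is a minimal sketch of the standard SUS scoring rule (ten items on a 1 to 5 scale; odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5). This is the conventional scoring, not something described in the summary above, and the function name `sus_score` is hypothetical. The study's modifications concerned how the questionnaire was administered; whether the scoring was changed is not stated here.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for ten responses on a 1-5 scale.

    Standard scoring (an assumption about what was used in the study):
    odd-numbered items are positively worded, even-numbered items negatively.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # odd item: higher agreement is better
        else:
            total += 5 - response   # even item: lower agreement is better
    return total * 2.5
```

A participant who answers 3 to every item scores `sus_score([3] * 10) == 50.0`. Because the finding above is that participants wanted to comment after each question, a digital form might pair each item with a free-text field and keep those comments alongside the numeric score.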
Relevance
This research provides actionable guidance for UX practitioners and accessibility researchers who want to include people with dementia in usability testing, a population that is typically excluded or involved only through caregiver proxies. The 10 summarized lessons learned offer a practical checklist for planning inclusive remote studies. The finding that traditional usability metrics (task completion time, error rate) are inappropriate for this population has significant implications for accessibility evaluation standards. The alternative metrics proposed include task interpretation status, completion status with contextual factors, navigational path analysis, intervention type logging, and qualitative observations of help-seeking behavior. This framework could inform more inclusive usability evaluation approaches beyond dementia. The Remote Access and Modified Think-Aloud methods also extend beyond dementia research: they apply to any remote study where participants cannot install software, have unreliable internet, lack confidence with technology, or are geographically distant. The emphasis on participant agency, flexible methods, and adapting to individual needs models best practices for accessible research design. The study also demonstrates that people with mild to moderate dementia can and should participate directly in technology evaluation rather than being represented solely by caregivers.
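The alternative metrics listed above could be captured per task in a simple observation record. The sketch below is one possible shape, not the authors' instrument: the class `TaskObservation`, its field names, the completion labels, and the helper `intervention_counts` are all hypothetical and chosen only to mirror the five proposed metrics.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskObservation:
    """One participant's attempt at one task (all field names are illustrative)."""
    task_id: str
    interpreted_as_intended: bool          # task interpretation status
    completion: str                        # e.g. "completed", "partial", "abandoned"
    context: List[str] = field(default_factory=list)        # contextual factors, e.g. "break taken"
    path: List[str] = field(default_factory=list)           # navigational path, screen by screen
    interventions: List[str] = field(default_factory=list)  # intervention types the researcher used
    help_seeking_notes: str = ""           # qualitative observations


def intervention_counts(observations):
    """Tally how often each intervention type occurred across observations."""
    counts = Counter()
    for obs in observations:
        counts.update(obs.interventions)
    return counts


observations = [
    TaskObservation("t1", True, "completed", path=["menu", "settings"],
                    interventions=["rephrased task"]),
    TaskObservation("t2", False, "partial", context=["break taken"],
                    interventions=["rephrased task", "narrowed scope"]),
]
summary = intervention_counts(observations)
```

Keeping interpretation status separate from completion status lets an analyst distinguish a task that failed from a task that was completed under a different reading of the instructions, which is the distinction the findings highlight.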
Tags: dementia · usability testing · remote research · cognitive accessibility · older adults · research methods · user research · summative evaluation