Voice-Powered Assembly: Boosting Self-Efficacy in Older Adults with "Build2Race"
Noah Zijie Qu, Mark Chignell, Jamy Li · 2026 · ACM Transactions on Accessible Computing · doi:10.1145/3789501
Summary
This study investigates whether voice assistants (VAs) can support older adults in physical product assembly, a largely unexplored use case despite the proliferation of self-assembly products and the importance of such tasks for independent living. The research compares a custom VA prototype called "Build2Race" against a controlled voice call (VC) system that simulates traditional phone-based customer support. Eighteen older adults (aged 65 to 90+) from two Toronto retirement homes assembled a "2RaceWithMe" recumbent exercise bike by following step-by-step voice instructions. The VA condition used Google Cloud speech recognition and text-to-speech with a GUI showing progress, while the controlled-VC condition used pre-recorded human audio to eliminate confounding variables between conditions. Both systems followed identical scripts: participants requested a step by saying "step [number]" and component information by saying "item [number]." The study measured assembly performance (completion time, errors), self-efficacy (both domain-specific assembly self-efficacy and general self-efficacy, the latter using the General Self-Efficacy Scale, GSES), and system usability (a modified System Usability Scale, SUS). Cognitive ability was assessed using the BrainTagger game to control for individual differences. A 1-month follow-up measured whether self-efficacy improvements persisted. The research is grounded in Bandura's self-efficacy theory and the Computers Are Social Actors (CASA) paradigm, suggesting that older adults may respond to VAs similarly to human instructors, with the potential advantage that non-human agents may make users feel more in control of their own performance.
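The "step [number]" and "item [number]" command format described above can be illustrated with a small parser. This is a minimal sketch, not the Build2Race implementation: the names `Command` and `parse_command`, the number-word table, and the choice to ignore surrounding filler words are all assumptions made for illustration. It takes a transcript string, such as a speech recognizer might return, and extracts the command kind and number.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical table: a recognizer may return "7" or "seven", so accept both.
_NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11, "twelve": 12,
}


class Command(NamedTuple):
    kind: str    # "step" or "item"
    number: int  # the step or item number that was requested


# Match "step" or "item" followed by digits or a single word.
_PATTERN = re.compile(r"\b(step|item)\s+(\d+|[a-z]+)\b")


def parse_command(transcript: str) -> Optional[Command]:
    """Return the first step/item command in the transcript, or None."""
    match = _PATTERN.search(transcript.lower())
    if match is None:
        return None
    kind, raw = match.groups()
    if raw.isdigit():
        return Command(kind, int(raw))
    number = _NUMBER_WORDS.get(raw)
    if number is None:
        return None  # e.g. "step please": keyword present, no usable number
    return Command(kind, number)
```

For example, `parse_command("Step 3")` and `parse_command("could you repeat item seven please")` both yield a command, while an utterance without a recognizable keyword and number returns `None`, which a caller could treat as a cue to re-prompt the user.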
Key findings
Both the VA and controlled-VC conditions significantly improved participants' assembly self-efficacy (pre: M=5.9 to post: M=6.8, p<0.01) and general self-efficacy (pre: M=2.9 to post: M=3.0, p<0.01), with no significant difference between conditions immediately after the task. This indicates that voice-based instructional systems, whether VA or VC, can provide effective mastery experiences for older adults. The critical finding emerged at the 1-month follow-up: assembly self-efficacy improvements were better sustained in the VA condition (effect size 0.45, 95% CI: 0.1-0.8) than in the controlled-VC condition. The VA group's assembly self-efficacy continued to rise from post-treatment to follow-up, while the VC group's gains eroded slightly. This advantage did not extend to general self-efficacy, which plateaued in both groups. On raw performance, VA users took significantly longer to complete the assembly (M=27 min) than VC users (M=21 min), primarily because of Google Cloud TTS response delays, though error rates were similar. Raw task performance metrics may therefore fail to capture the psychological benefits of VA-assisted assembly. Qualitative interviews revealed that four VA participants spontaneously mentioned that the system boosted their confidence by "finding the parts and keeping them in order," while only one VC participant mentioned confidence benefits. Participants with lower cognitive ability (measured by response inhibition) showed greater immediate self-efficacy gains, though this advantage was not sustained at follow-up.
Relevance
This research fills an important gap in accessibility literature by focusing on physical assembly, a prerequisite for using many products that is often overlooked in favor of post-setup information tasks. As the authors note, many accessibility failures occur at the physical assembly stage before the device is even in use; if older adults cannot assemble exercise equipment, furniture, or medical devices, they cannot benefit from them. The key insight for practitioners is that VA-based instruction may promote more durable self-efficacy gains than human-delivered phone support, possibly because users feel more "in charge" when interacting with a non-human agent. This suggests VAs could supplement or replace customer service representatives for assembly assistance, with potential cost savings and 24/7 availability. The study offers specific design guidelines for VAs targeting older adults: (1) provide clear, structured step-by-step instructions without overwhelming detail; (2) minimize response latency, since delays frustrate users and obscure VA benefits; (3) pair voice with visual progress tracking (the VA included a GUI showing completed steps); (4) use natural language and avoid technical jargon; (5) consider physical limitations such as vision and mobility when designing interaction modalities; (6) allow voice commands that do not require precise articulation. The research also argues that physical assembly deserves more attention in HCI accessibility research, which has predominantly focused on information-seeking and cognitive tasks rather than the hands-on physical activities essential for aging in place.
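One possible way to act on guideline (2) is to synthesize the audio for every scripted step once, ahead of time, so that a request for a step is answered from memory rather than waiting on a text-to-speech round trip. The sketch below is an assumption for illustration and not something the study reports doing: `StepAudioCache` is a hypothetical name, the `synthesize` callable is a stand-in for whatever TTS service is used (it is not the Google Cloud API), and the script text is placeholder content.

```python
from typing import Callable, Dict


class StepAudioCache:
    """Caches synthesized audio per step so repeat requests skip the TTS call."""

    def __init__(self, script: Dict[int, str],
                 synthesize: Callable[[str], bytes]) -> None:
        self._script = script          # step number -> instruction text
        self._synthesize = synthesize  # stand-in for a real TTS request
        self._audio: Dict[int, bytes] = {}

    def prefetch(self) -> None:
        """Synthesize every step up front, e.g. while the user unpacks parts."""
        for number in self._script:
            self.get(number)

    def get(self, number: int) -> bytes:
        """Return audio for a step, synthesizing only on the first request."""
        if number not in self._audio:
            self._audio[number] = self._synthesize(self._script[number])
        return self._audio[number]


def fake_synthesize(text: str) -> bytes:
    # Placeholder: a real system would return audio bytes from a TTS service.
    return text.encode("utf-8")


# Placeholder script; the study's actual instruction text is not reproduced here.
cache = StepAudioCache({1: "placeholder text for step 1",
                        2: "placeholder text for step 2"}, fake_synthesize)
cache.prefetch()
audio = cache.get(1)  # served from the cache, no second synthesis call
```

Because the instruction script is fixed, the trade-off is a short delay at startup in exchange for immediate responses during assembly. Whether that would close the completion-time gap reported in the study is untested.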
Tags: older adults · voice assistants · self-efficacy · aging in place · assembly tasks · independence · mastery experience
Standards referenced: SUS · GSES