MotionBuddy: Exploring Tactile-Based Motion Learning with a Tabletop Humanoid Robot for Blind People
Kengo Tanaka, Xiyue Wang, Hironobu Takagi, Yoichi Ochiai, Chieko Asakawa · 2026 · Proceedings of the 21st ACM/IEEE International Conference on Human-Robot Interaction (HRI '26) · doi:10.1145/3757279.3788660
Summary
This HRI 2026 study, by the same team behind the earlier 3D tactile-pose work, asks whether a tabletop humanoid robot can convey dynamic body movements to blind learners more effectively than audio instruction alone. The motivation is that physical 3D models and tactile graphics can communicate static postures but not transitions or simultaneous limb coordination, while verbal descriptions struggle with timing, spatial relationships, and multi-limb synchrony. A humanoid robot is an embodied, movable demonstrator that a learner can touch repeatedly without the social discomfort of touching a human instructor. The authors built MotionBuddy around a 40 cm AiNex tabletop humanoid, programmed with ROS and Python to execute pre-authored pose sequences. For safety, the robot was suspended from a spring-retractable overhead tool balancer so it could move freely without tipping, and contact during motion was restricted to the robot's hands and feet. Polystyrene-foam tactile markers were affixed to the shoulders, elbows, hands, knees, and feet so participants could locate joints by touch. Eleven blind participants (seven male, four female; mean age 49) learned two movement themes: a short Bon Odori (Japanese folk-dance) sequence with two poses and a Karate sequence with four poses. Each participant learned every sequence twice, once via narrated audio instruction and once via robot demonstration, with modality order counterbalanced. Objective measures included task completion time and joint-specific reproduction accuracy scored by two blinded raters. Subjective measures included the Raw NASA Task Load Index, the Robotic Social Attributes Scale (Short Form), and custom seven-point Likert items for transition understanding, easiness, usefulness, and overall preference.
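As a rough illustration of two methodological pieces mentioned above, the sketch below computes a Raw NASA-TLX score (the unweighted mean of the six standard subscales, here assumed to be rated 0 to 100) and assigns a counterbalanced modality order by simple alternation. The function names, the participant IDs, and the alternation scheme are illustrative assumptions; the summary does not say how the authors actually assigned orders or processed their ratings.

```python
from statistics import mean

# The six standard NASA-TLX subscales; each is assumed to be rated 0-100.
TLX_SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings: dict[str, float]) -> float:
    """Raw TLX: the unweighted mean of the six subscale ratings."""
    missing = [name for name in TLX_SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return mean(ratings[name] for name in TLX_SUBSCALES)


def counterbalanced_orders(participant_ids, modalities=("audio", "robot")):
    """Alternate which modality comes first across participants.

    Hypothetical scheme: even-indexed participants start with the first
    modality, odd-indexed participants start with the second.
    """
    orders = {}
    for index, pid in enumerate(participant_ids):
        first = modalities[index % 2]
        second = modalities[(index + 1) % 2]
        orders[pid] = (first, second)
    return orders


if __name__ == "__main__":
    # Made-up ratings for one participant in one condition.
    example = {
        "mental_demand": 40,
        "physical_demand": 20,
        "temporal_demand": 30,
        "performance": 50,
        "effort": 60,
        "frustration": 40,
    }
    print(raw_tlx(example))
    print(counterbalanced_orders(["P1", "P2", "P3"]))
```

With an odd number of participants, such as the eleven here, simple alternation cannot balance perfectly: one order is used six times and the other five.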
Key findings
Robot demonstrations produced significantly higher reproduction accuracy than audio on the two Bon Odori components that depended on simultaneous limb coordination and pose transitions (upper body 1.91 vs 1.41, transitions 0.77 vs 0.27; p < 0.05), while lower-body accuracy was comparable across the two modalities. Upper-body errors (wrong arm abduction, incorrect flare) occurred for seven participants under audio versus one under the robot, and ten of eleven participants made at least one transition error under audio compared to five under the robot. For the Karate sequence, which was longer (four poses) but simpler per pose, differences were not statistically significant. Completion times did not differ significantly, though the Karate sequence trended faster under the robot (305 s vs 345 s). Subjective measures strongly favoured the robot: Raw-TLX workload was 43 for the robot versus 70 for audio; the robot was rated significantly higher on transition understanding, easiness, usefulness, and overall preference (all p < 0.01); and RoSAS-SF scores showed the robot was perceived as warmer, more competent, and less discomforting than audio instruction. Qualitative feedback emphasised that the robot's held postures enabled repeated confirmation of form, that simultaneous multi-limb relationships were perceivable through touch in a way they were not through sequential speech, and that participants felt more autonomous because they did not need to ask permission to touch the demonstrator. Participants repeatedly requested broader touchable regions (elbows, head), control over tempo, and corrective feedback during practice.
Relevance
For accessibility practitioners, this study is a useful data point for the growing argument that embodied, touchable demonstrators address specific gaps (transitions, synchrony, timing) that 2D tactile graphics, audio descriptions, and even static 3D-printed models cannot. The social-privacy angle matters in practice: several participants explicitly framed the robot as a solution to the awkwardness of asking a human instructor, particularly of the opposite sex, for the close physical contact needed to understand posture. The study is nevertheless exploratory: pre-authored motions, tactile access restricted to hands and feet, 11 participants, and short-term learning only. The findings are most actionable as design guidance: systems should support learner-controlled tempo, dialogic repeat or slow-down requests, corrective feedback, and a thoughtfully negotiated boundary between safe tactile access and mechanical safety. The authors' proposed combination of audio labels and robot verification (or vice versa) is a useful blueprint for multimodal accessible instructional systems across physical education, rehabilitation, and dance. Practitioners should also note the field's trajectory: the same research team has published at successive venues (ASSETS-style 3D models, then TEI 3D-printed poses, now HRI dynamic robots), all converging on multimodal tactile-plus-audio pipelines for accessible movement education.
Tags: human-robot interaction · humanoid robot · assistive robotics · blindness and low vision · visual impairment · motion learning · tactile interaction · physical activity · accessible education · multimodal interaction