How Does Delegation in Social Interaction Evolve Over Time? Navigation with a Robot for Blind People

Rayna Hata, Masaki Kuribayashi, Allan Wang, Hironobu Takagi, Chieko Asakawa · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26) · doi:10.1145/3772318.3791439

Summary

Hata and colleagues report a three-week longitudinal study of how six blind participants delegate social navigation tasks to a guide robot in the Miraikan science museum. The paper pushes back against the default assumption that accessible navigation robots should be as autonomous as possible: full autonomy can strip users of agency, and in practice such systems run into the well-documented freezing-robot problem when they encounter dense crowds, queues, or physical obstacles they cannot route around. The authors build on a shared-control paradigm in which the robot and the user jointly negotiate when to act. Their platform is a suitcase-style wheeled robot (built on the open-source AI Suitcase codebase) equipped with LiDAR, three RGB-D cameras, a neck-worn Bluetooth speaker for private audio output, and a loud external speaker for socially directed phrases. A Surrounding GPT function uses GPT-4o to generate on-demand scene descriptions (objects, people, spatial layout); a right-hand button triggers the robot to say 'Excuse me, please move'; and a left-hand button triggers 'Excuse me, please help me'. Each week, participants walked three routes across museum floors that included staged obstacles (a green foam block, two suitcases, and a baby stroller), staged crowds, queues at destinations, and naturally occurring museum visitors. After each session, participants completed RoSAS ratings and five study-specific Likert items on delegation preference, environmental-description quality, and decision-making confidence, followed by semi-structured interviews. Behavioural data were coded from video and system logs to track delegation rates, Surrounding GPT usage, and responses to obstacle-related GPT triggers.
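The control scheme described above can be pictured as a small event-to-utterance mapping. The following is a minimal sketch, not the AI Suitcase implementation: the event names, the Channel and Utterance types, and the handle_event function are all hypothetical, and routing scene descriptions to the private neck speaker is an assumption based on the summary's description of that speaker. Only the two spoken phrases are taken from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    PRIVATE = "neck speaker"      # audio intended for the user only
    PUBLIC = "external speaker"   # audio intended for bystanders


@dataclass(frozen=True)
class Utterance:
    channel: Channel
    text: str


# Phrases quoted from the summary; the event names are hypothetical.
BUTTON_PHRASES = {
    "right_button": "Excuse me, please move",
    "left_button": "Excuse me, please help me",
}


def handle_event(event: str, scene_description: str = "") -> Utterance:
    """Map a user-triggered event to what is spoken and on which speaker."""
    if event in BUTTON_PHRASES:
        # Socially directed phrases go out loud, on the user's behalf.
        return Utterance(Channel.PUBLIC, BUTTON_PHRASES[event])
    if event == "describe_surroundings":
        # Assumption: scene descriptions are private to the user.
        return Utterance(Channel.PRIVATE, scene_description)
    raise ValueError(f"unknown event: {event!r}")
```

The point of the sketch is that delegation is always user-initiated: nothing is spoken to bystanders unless the user presses a button.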

Key findings

Delegation behaviour was highly heterogeneous and evolved over time. Three participants (P2, P3, P4) maintained consistently high delegation rates (83-100%) from Week 1, while the others (P1, P5, P6) started at 0-20% and increased across weeks. P5's delegation rate rose from 20% in Week 1 to 100% in Weeks 2 and 3; P6's rose from 0% to 54.5%. Participants reported that the robot's synthetic voice was louder and more captivating than their own, especially in noisy museum settings, and that delegating removed the discomfort of directly asking strangers for help. Surrounding GPT usage shifted from curiosity-driven exploration in Week 1 to targeted, context-specific queries by Week 3; for example, participants queried the environment right before deciding whether to ask a crowd to move. Participants also recalibrated their responses to obstacle-related GPT triggers: they learned to wait through small corrective robot movements (backing up, turning) rather than treating every alert as a call to act, which made their interventions more selective. Linear mixed-effects models showed a significant positive effect of week on preference for the robot to ask for help (F(2,10) = 13.57, p = .0016), preference for the robot to ask people to move (F(2,10) = 13.21, p = .0014), and ability to understand surroundings via GPT descriptions (F(2,10) = 8.45, p = .0071). Confidence scores trended upward, but the group-level effect was not significant. The authors identify three user archetypes: independent-first, balanced, and delegation-first. Unplanned failures (the robot getting stuck on slanted walls, queues misclassified as crowds, and visitors who did not speak the local language ignoring requests) exposed ongoing limitations.
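The delegation rates quoted above are percentages of coded opportunities in which the participant let the robot speak. A minimal sketch of that arithmetic follows, assuming the rate is simply delegated events divided by opportunities; the function name and the example counts are hypothetical, since the summary reports only the resulting percentages.

```python
from typing import Optional


def delegation_rate(delegated: int, opportunities: int) -> Optional[float]:
    """Percentage of opportunities delegated to the robot, to one decimal.

    Returns None when there were no opportunities, so that an empty
    session is not confused with a 0% rate.
    """
    if opportunities < 0 or not 0 <= delegated <= opportunities:
        raise ValueError("counts must satisfy 0 <= delegated <= opportunities")
    if opportunities == 0:
        return None
    return round(100.0 * delegated / opportunities, 1)


# Illustrative counts only: 6 of 11 is one pair that yields 54.5%.
print(delegation_rate(6, 11))  # 54.5
```

Returning None for a session with no opportunities matters with a sample this small, because a single empty session would otherwise look like a refusal to delegate.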

Relevance

For accessibility practitioners, this paper is one of the few longitudinal studies of an assistive robot in the wild, and it challenges the assumption that first-encounter usability studies are sufficient evidence for guide-robot design. The key lesson is that delegation is not a fixed user preference but a dynamic negotiation shaped by environmental context (crowd density, noise), user personality, and accumulated interpretive skill in reading the robot's state. Design implications that transfer to other accessibility work include: make corrective or searching behaviour legible so users know whether to intervene; support user-driven follow-up queries rather than one-shot scene descriptions; offer adjustable verbosity; and plan explicitly for multilingual and crowded deployments where the robot's voice alone will not always suffice. The study is limited by a small sample (n=6), a single site, a three-week horizon, and the absence of a no-delegation control condition. It nonetheless provides a useful empirical baseline for a growing class of shared-control guide robots and extends the Miraikan/IBM research programme on accessible museum navigation. Practitioners evaluating any assistive technology that mediates social interactions (not just robots, but also AI image-description apps, on-demand video interpreting, and augmented white canes) should take the longitudinal, context-sensitive framing seriously.

Tags: assistive robotics · navigation · blindness and low vision · visual impairment · shared control · social navigation · longitudinal study · human-robot interaction · delegation · museum accessibility