Not Seeing the Whole Picture: Challenges and Opportunities in Using AI for Co-Making Physical, DIY-AT for People with Visual Impairments

Ben Kosa, Hsuanling Lee, Jasmine Li, Sanbrita Mondal, Yuhang Zhao, Liang He · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26) · doi:10.1145/3772318.3791815

Summary

This CHI 2026 paper explores whether an LLM-based conversational agent can act as a co-making partner, not just a visual aid, when people with visual impairments (PVI) build their own physical assistive technology. The authors extended their prior A11yBits tangible toolkit with a GPT-4o assistant called A11yMaker AI that provides brainstorming, toolkit expertise, module recognition via ArUco markers, and programmatic configuration of event-based if/then logic across six sensing modules (Temperature, Distance, Motion, Light, Posture, Camera) and four feedback modules (Vibration, Sound, LED Display, Motor). Nine PVI participants (5 low vision, 3 blind, ages 26-82) each completed a 150-minute lab session: a pre-study interview, a tutorial with a sample bag-moved alert task, two self-defined solution-building tasks, and a post-task interview. Sessions were recorded and transcribed, and two researchers coded 486 quotes using reflexive thematic analysis, iterating on the codebook. Across the nine sessions, participants co-made 14 working DIY-ATs spanning navigation (bus stops, doors, outlets), item search (shampoo, keys, water bottles, TVs), and environmental awareness (person detection, candle-out alerts, room temperature). The work extends prior research on plug-and-play accessibility toolkits and generative AI for PVI creativity by focusing on the embodied, physical fabrication process rather than software-only workflows.
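The event-based if/then logic described above can be pictured as a small data structure. The following is a minimal sketch, not the toolkit's actual implementation: the module names come from the summary, while the `Rule` class, the event and action labels ("moved", "beep"), and the choice of Motion and Sound for the bag-moved alert are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Module names are taken from the summary above.
SENSING_MODULES = ("Temperature", "Distance", "Motion", "Light", "Posture", "Camera")
FEEDBACK_MODULES = ("Vibration", "Sound", "LED Display", "Motor")


@dataclass(frozen=True)
class Rule:
    """One if/then pairing of a sensing module with a feedback module (hypothetical)."""
    sensor: str
    event: str      # hypothetical event label, e.g. "moved"
    feedback: str
    action: str     # hypothetical action label, e.g. "beep"

    def __post_init__(self) -> None:
        # Reject pairings that name a module the toolkit does not have.
        if self.sensor not in SENSING_MODULES:
            raise ValueError(f"unknown sensing module: {self.sensor}")
        if self.feedback not in FEEDBACK_MODULES:
            raise ValueError(f"unknown feedback module: {self.feedback}")

    def describe(self) -> str:
        """Render the rule as a sentence that could be read aloud to the user."""
        return (
            f"IF {self.sensor} reports '{self.event}' "
            f"THEN {self.feedback} does '{self.action}'"
        )


# Assumed mapping for the tutorial's bag-moved alert; the paper may use other modules.
bag_alert = Rule(sensor="Motion", event="moved", feedback="Sound", action="beep")
print(bag_alert.describe())
```

Keeping the rule as plain data with a spoken-style description is one way a conversational assistant could read a configuration back to a user who cannot inspect it visually.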

Key findings

A11yMaker AI succeeded as an always-available, patient tutor, general-knowledge expert, and brainstorming bridge: participants praised being able to ask questions at their own pace rather than scheduling time with sighted experts. But four breakdown patterns emerged. (1) AI overlooking toolkit limitations: the assistant proposed solutions outside the hardware's capabilities (e.g., recognizing bus-stop signs) and surfaced the limits only when participants pushed back. (2) AI oversimplifying PVI needs: suggesting camera object detection for finding outlets ignored that a PVI user also needs direction and distance guidance to reach the object, not just confirmation that it exists in the frame. (3) Hallucinated capabilities: the assistant claimed it could train the camera on novel classes, report temperature through the sound module, or locate a bottle in 3D space based on sound alone, none of which the toolkit supports. (4) Failure to resolve ambiguity: on vague or unconventional prompts, the AI answered literally rather than asking clarifying questions, leaving some participants with unusable solutions. Beyond AI errors, physical co-making surfaced a hard spatial/visual scaffolding gap: participants could not verify whether modules were correctly oriented, still powered, within sensor range, or even attached, and the AI provided no proactive situational feedback, so participants relied on sighted researcher intervention. Most participants explicitly wanted the AI to act like a 'sighted guide'.

Relevance

For accessibility practitioners designing AI assistants for blind users or DIY assistive-technology toolkits, this paper reframes the design target: the hard problem is not the language interface but embodied, situated collaboration. Three concrete design moves follow: (1) design for imperfect AI, treating hallucinations as inevitable and adding verification affordances (program verification, capability schemas, multi-agent cross-checking, selective verification of toolkit claims) rather than trying to eliminate errors; (2) balance power and abstraction, letting users inspect, contest, and step down into lower-level detail when needed rather than only speaking natural language; and (3) evolve toward multimodal 'sighted guide' AI that uses cameras, AR glasses, or wearables to give proactive spatial feedback about module placement, orientation, and state. The findings also warn against the 'one-size-fits-all' AT trap and reinforce that DIY toolkits for PVI must treat physical constraints (orientation, line-of-sight, battery, attachment) as first-class accessibility concerns. Limitations: lab-only study, 9 participants, constrained event vocabulary, no longitudinal real-world deployment.
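The "capability schema" idea in design move (1) can be sketched as a lookup that checks each capability an assistant claims against what the hardware is known to support. This is an illustrative sketch only: the schema contents, the capability names, and the `unsupported_claims` helper are hypothetical and do not describe the A11yBits toolkit's real interface. The two flagged claims mirror hallucinations reported in the study (training the camera on novel classes, reporting temperature through the sound module).

```python
# Hypothetical capability schema: module name -> capabilities assumed to be supported.
# The capability labels are invented for this sketch.
CAPABILITY_SCHEMA: dict[str, set[str]] = {
    "Temperature": {"threshold_crossed"},
    "Camera": {"detect_known_object"},
    "Sound": {"play_tone"},
    "Vibration": {"pulse"},
}


def unsupported_claims(claims: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the (module, capability) pairs that the schema cannot confirm.

    A claim is flagged when the module is unknown or when the capability
    is not listed for that module.
    """
    return [
        (module, capability)
        for module, capability in claims
        if capability not in CAPABILITY_SCHEMA.get(module, set())
    ]


# Claims an assistant might make while proposing a solution (illustrative).
ai_claims = [
    ("Camera", "detect_known_object"),
    ("Camera", "train_new_class"),     # hallucination reported in the study
    ("Sound", "report_temperature"),   # hallucination reported in the study
]

flagged = unsupported_claims(ai_claims)
for module, capability in flagged:
    print(f"Cannot verify: {module} -> {capability}")
```

A check like this does not stop the model from hallucinating; it only gives the system a point at which to warn the user before they invest effort in an unbuildable design, which matches the paper's framing of verification over error elimination.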

Tags: blindness · low vision · DIY assistive technology · tangible interaction · generative AI · large language model · accessibility prototyping · human-AI collaboration · hallucinations