ChatWoz: Chatting through a Wizard of Oz

Pedro Fialho, Luísa Coheur · 2015 · ASSETS '15: Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility · doi:10.1145/2700648.2811334

Summary

ChatWoz is a Wizard of Oz system designed to enable autistic children to interact with caregivers through a virtual avatar. The system was motivated by widely reported cases of autistic children enthusiastically engaging with commercial virtual assistants like Apple's Siri and Microsoft's Cortana. These interactions suggest that some children on the autism spectrum may find computer-mediated communication more comfortable than direct human interaction.

In the Wizard of Oz paradigm, users interact with what appears to be an autonomous computer system, unaware that a human operator is actually controlling the responses. ChatWoz applies this concept to child-caregiver communication: the caregiver controls a backend interface that determines what the avatar says, while the child sees only the animated virtual agent speaking with synthesized speech. This creates a form of mediated communication where the predictability and consistency of the virtual character may reduce social anxiety for the child.

The system was built using Unity for 3D rendering and animation. Text-to-speech output is synchronized with lip movements through phoneme-to-viseme mapping. Caregivers can type custom responses or select from predefined utterances organized into categories, enabling rapid response during real-time interaction. Three avatar characters are available (Catarina, Filipe, and Edgar), and agents can be swapped during a session. The background environment can be customized, and caregivers can optionally view and hear the child through audio/video feeds.
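To make the phoneme-to-viseme step mentioned above more concrete, here is a minimal sketch in Python. It is not the authors' Unity implementation: the mapping table, the viseme names, the `VisemeFrame` type, and the `viseme_timeline` function are all hypothetical. The sketch assumes the speech synthesizer can report each phoneme with start and end times in milliseconds.

```python
from dataclasses import dataclass

# Hypothetical, heavily reduced phoneme-to-viseme table (not taken from the paper).
# Real tables are language-specific and considerably larger.
PHONEME_TO_VISEME = {
    "p": "closed_lips", "b": "closed_lips", "m": "closed_lips",
    "f": "lip_to_teeth", "v": "lip_to_teeth",
    "a": "open_wide",
    "o": "rounded", "u": "rounded",
    "e": "spread", "i": "spread",
}
REST = "rest"  # fallback mouth shape for phonemes missing from the table


@dataclass(frozen=True)
class VisemeFrame:
    start_ms: int
    end_ms: int
    viseme: str


def viseme_timeline(timed_phonemes):
    """Convert (phoneme, start_ms, end_ms) tuples into a list of viseme frames.

    Back-to-back phonemes that share a viseme are merged into one frame,
    so the mouth shape is not re-triggered needlessly.
    """
    frames = []
    for phoneme, start, end in timed_phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, REST)
        last = frames[-1] if frames else None
        if last and last.viseme == viseme and last.end_ms == start:
            frames[-1] = VisemeFrame(last.start_ms, end, viseme)
        else:
            frames.append(VisemeFrame(start, end, viseme))
    return frames


timeline = viseme_timeline([("m", 0, 80), ("a", 80, 200), ("p", 200, 260), ("b", 260, 300)])
```

An animation loop could then look up the frame covering the current audio playback position and display the corresponding mouth shape. Keying the timeline to audio timestamps rather than to rendered frames is one way to keep lips and speech aligned if rendering slows down.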

Key findings

The system supports four emotional states for the avatar (neutral, joyful, sad, and surprised) that can be applied both when the agent is speaking and when it is "listening" to the child. After expressing a listening emotion, the avatar returns to its neutral state. This emotional expressiveness addresses research suggesting that clear, exaggerated emotional cues may be easier for some autistic individuals to interpret than subtle human facial expressions.

The dual-purpose design serves both research and practical needs. As a research tool, ChatWoz collects naturalistic dialogue data that can inform the development of autonomous conversational agents. As an intervention tool, it provides a structured communication channel that may help children who struggle with direct social interaction. The predefined utterance categories allow caregivers to respond quickly, maintaining conversational flow while reducing the cognitive load of typing during interaction.

Preliminary testing revealed technical limitations: video transmission through Unity caused significant latency, slowing the interface and potentially disrupting the real-time nature of the interaction. Audio transmission posed similar but smaller-scale problems. The researchers concluded that Unity is not suitable for video transmission and noted this as an area requiring alternative solutions.
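The avatar's emotion behaviour described in this section (four states, usable while speaking or listening, with a return to neutral after a listening reaction) can be sketched as a small state holder. This is an illustrative Python model, not ChatWoz code: the class and method names are hypothetical. What happens to the expression after a spoken utterance is not stated in this review, so the sketch assumes the speaking emotion simply persists until the next command.

```python
from enum import Enum


class Emotion(Enum):
    NEUTRAL = "neutral"
    JOYFUL = "joyful"
    SAD = "sad"
    SURPRISED = "surprised"


class AvatarEmotionState:
    """Hypothetical model of the avatar's displayed emotion."""

    def __init__(self):
        self.current = Emotion.NEUTRAL
        self.history = [Emotion.NEUTRAL]  # every expression shown, in order

    def _show(self, emotion):
        self.current = emotion
        self.history.append(emotion)

    def speak(self, text, emotion=Emotion.NEUTRAL):
        # Assumption: the speaking emotion stays on screen until the next
        # command; the review does not describe what happens after speech.
        self._show(emotion)
        return f"[{emotion.value}] {text}"

    def listen_reaction(self, emotion):
        # Stated in the review: after a listening emotion is expressed,
        # the avatar returns to its neutral state.
        self._show(emotion)
        self._show(Emotion.NEUTRAL)


avatar = AvatarEmotionState()
avatar.listen_reaction(Emotion.SURPRISED)
line = avatar.speak("Hello!", Emotion.JOYFUL)
```

Keeping a history of displayed expressions is a design choice of this sketch; it would also suit the data-collection role described above, since logged expressions could be aligned with the dialogue transcript.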

Relevance

ChatWoz represents a creative application of the Wizard of Oz methodology to autism intervention, combining research infrastructure with therapeutic potential. The approach acknowledges that while fully autonomous conversational AI for autistic children remains challenging, human-in-the-loop systems can provide immediate practical value while generating data to advance automation.

The reported observation that some autistic children engage more readily with virtual assistants than with humans has significant implications for communication support design. Rather than viewing technology-mediated interaction as inferior to "natural" communication, this work treats it as a valid and potentially preferable modality for some individuals. This perspective aligns with neurodiversity-informed design that accommodates different communication styles rather than exclusively promoting neurotypical interaction patterns.

For practitioners working with autistic children, the system offers a tool for structured social interaction practice: the caregiver keeps control over pacing, emotional tone, and content, while the child experiences the interaction as engagement with an autonomous character. Future work adding emotional prosody to the synthesized voice could enhance the system's ability to model emotional expression in speech.

Tags: autism · Wizard of Oz · virtual agent · dialogue systems · child-caregiver interaction · text-to-speech · emotion · avatar · AAC