GenRole: Personalizing Role Play for Educators Supporting Autistic Students' Social Interaction Learning

Yixuan Li, Keyi Zeng, Jiaqi Zong, Yingying Zhang, Hongzhu Deng, Li Wang, Xin Tong · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26) · doi:10.1145/3772318.3791948

Summary

This CHI 2026 paper introduces GenRole, a generative-AI system that helps educators build personalised role-play activities for teaching social interaction skills to autistic children. Role-play is a well-evidenced teaching method for autistic learners, but producing bespoke scenarios, characters, scripts and visual props by hand is a heavy burden. To close that gap, the authors built a web tool in which teachers enter a target social skill and a child profile and receive a personalised role-play script (GPT-4), character and background visuals (DALL-E 3), voice audio (iFLYTEK) and printable 'tangible cards', wrapped in a Unity-based classroom role-play stage. GenRole organises role-play into four progressive modes - Echo Mode (teacher reads the script while the child listens), Dialogue Mode (scripted back-and-forth), Character Mode (focus on visual cues) and Exploration Mode (open-ended prompts) - so difficulty can be graduated from concrete to abstract. The design was grounded in a formative study with three instructional supervisors and refined via a pilot study with 16 teachers (System Usability Scale, satisfaction Likert items). It was then evaluated in a two-week main user study with 11 teacher-student pairs in a Chinese special-education school: the students were autistic boys aged 9-15, each child attended three role-play classes, teachers rated social skills before and after, and 40-minute semi-structured interviews were analysed via thematic analysis. The work uses identity-first language and is explicitly framed as empowering autistic children to explore social dynamics 'on their own terms' rather than training them to act typical.
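The concrete-to-abstract progression across the four modes can be pictured as an ordered sequence that a teacher steps through. The following is a minimal illustrative sketch, not the authors' implementation: the names `Mode` and `next_mode` and the `ready` flag are hypothetical, and the ordering simply follows the order in which the modes are listed above.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Role-play modes in the order listed in the summary (assumed ordering)."""
    ECHO = 1         # teacher reads the script while the child listens
    DIALOGUE = 2     # scripted back-and-forth
    CHARACTER = 3    # focus on visual cues
    EXPLORATION = 4  # open-ended prompts


def next_mode(current: Mode, ready: bool) -> Mode:
    """Advance one step only when the teacher judges the child ready.

    `ready` is a hypothetical teacher decision, not a signal from the system.
    The final mode has no successor, so it is returned unchanged.
    """
    if not ready or current is Mode.EXPLORATION:
        return current
    return Mode(current + 1)
```

In this sketch the teacher stays in control of pacing: a child who is not ready simply remains in the current mode.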

Key findings

Teacher ratings of observed social-skill performance rose significantly from pre to post (pre M = 2.18, post M = 3.03 on a 5-point Likert scale; Wilcoxon z = 4.35, p < 0.01). GenRole's System Usability Scale score was 76.88 in the main study (73.42 in the pilot), landing in the 'good' usability range. The progressive four-mode design was praised for scaffolding rehearsal and generalisation; teachers reported that children moved from concrete Echo Mode through Dialogue Mode into more abstract exploration, with some transferring skills to real-world situations (e.g., sharing at a school gate). Echo and Dialogue Modes held attention best; Exploration Mode was harder for some children, prompting the team to add Character Mode in a mid-study iteration. Teachers strongly valued (a) personalised characters and scenes tied to a child's interests (e.g., aquariums, a red-hair-clip character) for engagement and (b) reduced prep time, with scripts and images generated in minutes rather than a week of manual material collection. Less-experienced teachers (< 3 years) rated GenRole especially positively (SUS 78.33 vs 65 for more experienced teachers). Tangible printed cards improved focus for some children but triggered problem behaviours (e.g., tearing paper) in others. Cartoon visuals appealed broadly but felt mismatched for children with cognitive impairments, who preferred realistic imagery. Teachers asked for more granular personalisation across developmental stages (past/present/future), a fifth mode covering varied scenarios for the same skill, and home-based practice support involving parents.
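For readers unfamiliar with how System Usability Scale figures such as those above are obtained, the sketch below applies the standard SUS scoring rule: ten items answered on a 1-5 scale, odd-numbered items contributing (response - 1), even-numbered items contributing (5 - response), and the sum multiplied by 2.5 to give a 0-100 score. The summary does not describe the authors' exact scoring procedure, so this is an assumption, and the function names and sample responses are made up for illustration.

```python
def sus_score(responses):
    """Score one respondent's ten SUS answers (each 1-5) on the 0-100 scale."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly ten responses, each between 1 and 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5


def mean_sus(all_responses):
    """Average the per-respondent SUS scores across a group."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Hypothetical respondent, not data from the paper.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```

A group figure is then the mean of the individual scores, which is what `mean_sus` computes.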

Relevance

For accessibility practitioners in education and assistive-technology teams, GenRole is a concrete worked example of how generative AI can reduce the preparation burden that keeps individualised social-skill instruction out of reach in time- and staff-constrained classrooms. Several design takeaways travel beyond autism and apply broadly to inclusive educational tooling: graduated modes from concrete to abstract, tangible printable companions to an on-screen experience, personalisation tied to a child's known life experiences, and a teacher-facing 'manual edit' window on all AI output. The paper is also useful as a case study in responsible GenAI reporting: the authors include an explicit 'Disclosure about Use of Large Language Models' section covering their use of the GPT API, ChatGPT and DALL-E 3. Key limitations: the user study ran only three classes per child, all participants were boys with basic learning ability in a single Chinese special-education school, and the cultural and educational context (IEP availability, teacher training, Chinese social norms) strongly shapes transferability. The authors rightly caution that 'one-size-fits-all' AI tools developed in high-income Western settings need to be co-adapted with local stakeholders rather than dropped into new contexts.

Tags: generative AI · personalization · social skills training · autistic children · role play · special education · large language models · educational technology · inclusive education · assistive technology