LingoLift: Supporting Educators in Personalized Oral Language Teaching for Autistic Children through Content Generation

Jiawen Zhang, Dongyijie Primo Pan, Li Wang, Pan Hui, Xin Tong · 2026 · Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26) · doi:10.1145/3772318.3790284

Summary

This CHI 2026 paper presents LingoLift, a generative AI system that supports special-education teachers and speech-language pathologists in preparing and delivering personalized oral language lessons for autistic children. The authors argue that existing autism-language technology largely positions children as independent users, sidelining the educators whose professional judgment makes one-on-one teaching effective, and that educators spend disproportionate time preparing individualized lesson plans and materials, leaving less time for instruction itself.

The work is structured in three phases. The formative study combined 30 hours of video observation of 3 educators and 5 autistic children (ages 6-8), analysis of the school's VB-MAPP-based teaching materials, and expert interviews with 3 speech-language pathologists. It surfaced three themes: systemic burden in lesson preparation, a lack of thematic coherence across lesson segments (articulation, vocabulary, grammar, conversation), and concrete educator strategies for ability-adaptive and interest-integrated teaching.

Grounded in VB-MAPP and a localized Chinese articulation assessment, LingoLift combines a React Native teacher tablet app, a Node.js/MongoDB vector-store backend that uses Tencent Hunyuan text and image models with retrieval-augmented generation (RAG), and a classroom projection client controlled through MediaPipe gesture recognition. Three design features (Personalized Teaching Content Generation, Coherent Learning Pathway, and AR Card Interface) drive a workflow that runs from child profile to learning objectives, thematic lesson, generated word lists and illustrations, and gesture-controlled delivery. A three-week deployment with 10 educator-child dyads (30 lessons) at a specialized school in Guangzhou evaluated usability, lesson quality, learning outcomes, and educators' adaptive practices.
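The workflow described above (child profile, then learning objectives, then a single theme shared by all lesson segments) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, field names, the `retrieve` helper, and the prompt wording are all hypothetical, and the keyword-overlap ranking is only a toy stand-in for the vector-store retrieval the system uses.

```python
from dataclasses import dataclass, field


@dataclass
class ChildProfile:
    """Hypothetical profile; field names are illustrative, not from the paper."""
    name: str
    interests: list[str]
    target_skills: list[str]  # e.g. VB-MAPP-style skill labels such as "tact"


@dataclass
class LessonPlan:
    theme: str
    objectives: list[str]
    segments: dict[str, str] = field(default_factory=dict)


# The four lesson segments named in the summary.
SEGMENTS = ("articulation", "vocabulary", "grammar", "conversation")


def retrieve(snippets: list[str], query_terms: list[str], k: int = 2) -> list[str]:
    """Toy stand-in for vector-store retrieval: rank snippets by keyword overlap."""
    terms = {t.lower() for t in query_terms}
    scored = [(len(terms & set(s.lower().split())), s) for s in snippets]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [s for score, s in ranked[:k] if score > 0]


def plan_lesson(profile: ChildProfile, snippets: list[str]) -> LessonPlan:
    """Tie every segment to one interest-based theme to keep the lesson coherent."""
    theme = profile.interests[0]
    objectives = [
        f"Practise {skill} using the theme '{theme}'"
        for skill in profile.target_skills
    ]
    context = retrieve(snippets, profile.target_skills + [theme])
    plan = LessonPlan(theme=theme, objectives=objectives)
    for segment in SEGMENTS:
        # A real system would send this prompt to a text model; here we only build it.
        plan.segments[segment] = (
            f"Generate a {segment} activity about {theme}. "
            f"Reference material: {' | '.join(context) or 'none'}"
        )
    return plan


# Illustrative data only; not taken from the study.
profile = ChildProfile(name="child_a", interests=["buses"], target_skills=["tact"])
snippets = [
    "tact training uses naming of pictured items",
    "mand training uses requests",
]
plan = plan_lesson(profile, snippets)
print(plan.theme, len(plan.segments))
```

Deriving every segment prompt from the same theme is one simple way to express the thematic-coherence goal in code; the paper's actual prompts, retrieval method, and data model may differ.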

Key findings

LingoLift reduced mean lesson preparation time from 72 minutes to 24.7 minutes (a reduction of about 66%; reported range 10-60 min) while maintaining high perceived quality. Post-study questionnaire means on 1-7 Likert scales were consistently above the midpoint: Ease of Use M=5.28 (SD=0.58), Usefulness-Efficiency M=5.08 (SD=0.88), Usefulness-Performance M=4.88 (SD=0.83), and Satisfaction M=4.80 (SD=1.00). Lesson Coherence and Personalized Alignment stayed stable across weeks. Child Engagement showed a statistically significant upward trend (Week 1 M=4.20 to Week 3 M=5.10, p<0.05), and objective achievement reached M=5.27 (SD=0.52). Tact (naming) skill gains were rated highest (M=5.17).

Seven of ten children (C1-C4, C7-C9) engaged strongly with the AR projection; three (C5, C6, C10) struggled with virtual content and preferred physical manipulatives. Educators went well beyond the intended functionality: T2 created an immersive "parking-lot" spatial-reasoning game, T1 designed a gesture-driven butterfly-chasing activity, T6 prepared multi-sensory supplements for articulation (video, pinwheel, and mirror), and T3 introduced olfactory and gustatory anchors via food for zoo-themed sessions.

Qualitative analysis revealed that educators actively "negotiated" AI outputs through iterative prompt refinement. T2 asked more than ten times for local bus imagery and eventually settled for the closest approximation, while T7 filtered out museum images she deemed unsuitable for easily frightened autistic children. The authors frame educators' prompt refinement as an interactional layer that steers the AI toward contextually and emotionally appropriate materials.
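The preparation-time figure reported in this section can be checked with a short calculation. The sketch below uses only the two means given in the summary; the function and variable names are illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Mean preparation times in minutes, as reported in the summary.
baseline_min = 72.0
with_tool_min = 24.7

reduction = percent_reduction(baseline_min, with_tool_min)
print(f"{reduction:.1f}% reduction")  # 65.7% reduction
```

The result, roughly 65.7%, is consistent with the rounded 66% quoted in the text.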

Relevance

For practitioners working on AI-assisted special education, this paper offers concrete evidence that generative AI in autism language teaching can be most valuable when it augments expert educators rather than replacing them. The 66% reduction in preparation time, combined with educators' creative extensions (physical-digital hybrids, multi-sensory compensations, local cultural anchoring), argues for a design stance the authors call "AI as Adaptive Collaborative Partner", with three capabilities: generative flexibility for hot-fixes, behavioural co-regulation (adjusting visual complexity and pacing in response to hyperactivity or attention loss), and embedded expert guidance. The VB-MAPP-grounded, assessment-driven content-generation architecture could plausibly be reused in other assistive language tools. The paper's documentation of children who did not respond well to virtual content (C5, C6, C10) is an honest limitation: projection-based AI materials risk systematically disadvantaging the most sensory-sensitive or cognitively support-intensive children, and the authors advocate hybrid physical-digital design and digital-readiness assessment before deployment. Other limitations include an urban Guangzhou sample, a Mandarin-specific phonetics pipeline, a three-week horizon confounded by concurrent therapies, and no longitudinal measurement of language development trajectories.

Tags: autism spectrum disorder · personalized learning · generative AI · oral language learning · human-AI collaboration · special education · speech-language pathology · VB-MAPP · educator-centered AI · inclusive education

Standards referenced: VB-MAPP · ABLLS-R