Auto-Generating Personas from User Reviews in VR App Stores

Yi Wang, Kexin Cheng, Xiao Liu, Chetan Arora, John Grundy, Thuong Hoang, Henry Been-Lirn Duh · 2026 · Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems (CHI EA ’26) · doi:10.1145/3772363.3798395

Summary

This CHI 2026 Extended Abstract reports on an auto-generated persona system developed to help undergraduate students surface accessibility requirements in virtual reality design projects. The authors argue that personas are well established in user-centered design and requirements engineering, but that constructing accessibility-focused personas for VR is particularly difficult: students lack access to disabled users, VR-specific accessibility barriers (motion sickness, spatial navigation, controller dependence) differ from those on desktop or mobile, and gaps in data-analysis skills often lead novices to fabricate superficial personas. The system scrapes user reviews from the 50 most popular Meta Quest Store and Steam VR applications, filters for accessibility-relevant reviews using WHO-aligned disability keywords and fuzzy matching, and removes advertisements, non-English content, and discriminatory language, yielding 396 high-quality reviews. Review chunks are embedded with a sentence-transformer model and stored in a Chroma vector database. A GPT-4o backend in a Retrieval-Augmented Generation (RAG) pipeline retrieves the most semantically relevant review segments for the student's VR project type and disability group, extracts structured dimension-value pairs, and assembles a persona with biography, pain points, grounded quotes, and accessibility requirements. DALL·E 3 generates profile photos. The authors evaluated the system in a two-week VR course with 24 students in a counterbalanced within-subjects study comparing the auto-generated persona condition against a traditional survey-based persona approach. Empathy was measured using the Interpersonal Reactivity Index (perspective taking, empathic concern, fantasy subscales), supplemented by semi-structured group interviews analyzed thematically in MAXQDA.
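
The filtering and retrieval steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The keyword list, the 0.85 similarity threshold, the sample reviews, and all function names are hypothetical, and a bag-of-words cosine similarity stands in for the sentence-transformer embeddings and Chroma vector store used in the paper. The GPT-4o persona-assembly step is omitted.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical keywords; the paper's WHO-aligned list is not reproduced here.
DISABILITY_KEYWORDS = ["wheelchair", "deaf", "blind", "colorblind", "seizure"]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9'-]+", text.lower())


def is_accessibility_relevant(review, keywords=DISABILITY_KEYWORDS, threshold=0.85):
    """Return True if any token fuzzily matches a keyword (tolerates typos).

    The threshold is an assumed value, not one taken from the paper.
    """
    for token in tokenize(review):
        for keyword in keywords:
            if SequenceMatcher(None, token, keyword).ratio() >= threshold:
                return True
    return False


def bow_vector(text):
    """Bag-of-words counts; a stand-in for a learned sentence embedding."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, reviews, k=2):
    """Return the k reviews most similar to the query."""
    query_vec = bow_vector(query)
    ranked = sorted(reviews, key=lambda r: cosine(query_vec, bow_vector(r)), reverse=True)
    return ranked[:k]


# Made-up reviews for illustration only.
REVIEWS = [
    "Great game but I use a wheelchair and cannot reach the crouch sections.",
    "I am deaf and the tutorial has no subtitles at all.",
    "Best rhythm game on Quest, buy it now!",
    "As a wheelchiar user seated mode would help a lot.",  # typo caught by fuzzy match
]

relevant = [r for r in REVIEWS if is_accessibility_relevant(r)]
top = retrieve("wheelchair seated mode", relevant, k=1)
```

In a fuller pipeline, the retrieved segments would be passed to the language model as grounding context for the persona. Swapping `bow_vector` for real embeddings would change the ranking quality but not the overall flow.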

Key findings

The auto-generated persona system produced significantly higher empathy scores than the survey-based condition on overall empathy (t = 2.989, p = .015; system M = 4.45 vs survey M = 3.06), perspective taking (t = 3.715, p = .004; 4.65 vs 3.25), and empathic concern (t = 2.515, p = .033; 4.35 vs 2.85). The fantasy subscale trended higher for the system (4.15 vs 3.10) but did not reach significance. Qualitative interviews converged on three patterns. First, students reported that personas grounded in real VR store reviews felt less abstract than survey-derived or fictional personas, with several students explicitly saying it was the first time they recognized that real people with disabilities use the VR applications they were designing. Second, the system triggered ethical self-reflection: students questioned whether their own design choices had contributed to inaccessible experiences. Third, participants suggested that the persona format alone was insufficient for full imaginative engagement and asked for scenario-based simulations, visualized pain-point maps, and first-person VR explorations of accessibility constraints. The authors note important caveats: empathy can lead to emotional over-identification and misread user needs (per Bennett & Rosner 2019), LLM+RAG pipelines still risk reproducing stereotypes embedded in review data, and the system was not evaluated for changes in implicit or explicit bias.
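
For readers who want to see how a within-subjects comparison of this kind is typically computed, the sketch below calculates a paired-samples t statistic using only the standard library. This summary does not state which test or degrees of freedom the authors used, so the paired t-test is an assumption. The scores are made up for illustration and are not the study's data, and `paired_t` is a hypothetical helper name.

```python
import math
from statistics import mean, stdev


def paired_t(condition_a, condition_b):
    """Return (t statistic, degrees of freedom) for a paired-samples t-test.

    Each index must hold the two scores from the same participant.
    """
    if len(condition_a) != len(condition_b) or len(condition_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t_stat = mean(diffs) / (sd / math.sqrt(n))
    return t_stat, n - 1


# Made-up empathy scores on a 1-5 scale, one pair per participant.
system_scores = [4.5, 4.0, 5.0, 4.5, 4.0, 4.5]
survey_scores = [3.0, 3.5, 3.0, 2.5, 3.5, 3.0]

t_stat, df = paired_t(system_scores, survey_scores)
print(f"t({df}) = {t_stat:.3f}")
```

A p-value would then be read from the t distribution with `df` degrees of freedom, for example with a statistics package.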

Relevance

For accessibility educators and VR practitioners, this paper offers a concrete template for introducing disability-informed requirements work into design courses without requiring students to recruit disabled participants, which is a significant barrier in undergraduate settings. The RAG-over-app-store-reviews approach is directly adaptable to mobile and desktop contexts and shows how existing public review data can scaffold persona generation that is grounded rather than fabricated. For practitioners, the study reinforces that VR carries distinct accessibility concerns (motion sickness, controller dependence, spatial navigation) that generic guidelines do not fully address, and that these concerns surface readily in app-store reviews if developers look for them. Limitations are significant: n=24 in a single class, short engagement, no bias measurement, and the well-known risk that LLM-generated personas encode reviewer stereotypes rather than lived experience. Direct engagement with disabled users remains the gold standard.

Tags: virtual reality · VR accessibility · personas · requirements engineering · large language models · retrieval-augmented generation · accessibility education · user-centered design · empathy

Standards referenced: WCAG