FocusView: Understanding and Customizing Informational Video Watching Experiences for Viewers with ADHD
Hanxiu Hazel Zhu, Ruijia Chen, Yuhang Zhao · 2025 · ASSETS 2025: 27th International ACM SIGACCESS Conference on Computers and Accessibility · doi:10.1145/3663547.3746386
Summary
This paper presents FocusView, an AI-powered video customization interface designed to help viewers with ADHD reduce distractions and maintain focus when watching informational videos. Videos have become a dominant medium for educational and professional content, but their dynamic, multimodal nature (speakers, presentation screens, overlays, backgrounds, captions, and audio) can create significant attention challenges for people with ADHD. The researchers first conducted a formative study analyzing over 350 ADHD-relevant videos and 7,000+ viewer comments on YouTube and TikTok, identifying six types of video distractions: speaker appearance, content overlays, auxiliary information (watermarks, ads), background visuals, caption presentations, and background audio. Based on these findings, they designed FocusView to support customization of four aspects: layout, background, captions, and audio. Layout customization offers three simplified views: Speaker Focus (enlarging the speaker while removing overlays), Content Focus (enlarging presentation content while removing other elements), and Auxiliary Removal (removing non-essential overlays). Background customization allows blurring the background or replacing it with a solid color. Caption customization provides control over color, font style (including Bionic Reading font), size, and position, plus a Dynamic Caption Tracking feature that highlights the currently spoken word. Audio customization removes background sounds while enhancing speech clarity. The system uses state-of-the-art computer vision models (YOLO11, SAM2, TranSalNet) and audio processing models to segment and modify video elements.
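To make the Dynamic Caption Tracking idea concrete, the following is a minimal sketch of how a player might pick the word to highlight at a given playback time. It is not the authors' implementation: the TimedWord type, the function names, and the assumption that each caption word arrives with start and end timestamps (for example from a forced aligner or a speech recognizer) are all hypothetical and introduced here for illustration only.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimedWord:
    """Hypothetical caption word with start/end times in seconds."""
    text: str
    start: float
    end: float


def current_word_index(words: List[TimedWord], t: float) -> Optional[int]:
    """Return the index of the word being spoken at time t, or None.

    Assumes `words` is sorted by start time and words do not overlap.
    """
    starts = [w.start for w in words]
    # Last word whose start time is <= t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < words[i].end:
        return i
    return None  # t falls before the first word or in a pause


def render_caption(words: List[TimedWord], t: float) -> str:
    """Render the caption line, marking the current word with brackets."""
    idx = current_word_index(words, t)
    return " ".join(
        f"[{w.text}]" if i == idx else w.text for i, w in enumerate(words)
    )


if __name__ == "__main__":
    line = [
        TimedWord("Plants", 0.0, 0.4),
        TimedWord("need", 0.4, 0.7),
        TimedWord("light", 0.9, 1.3),
    ]
    for t in (0.2, 0.5, 0.8, 1.0):
        print(f"{t:.1f}s  {render_caption(line, t)}")
```

In a real interface the brackets would be replaced by a visual highlight (color or weight change), and nothing is highlighted during pauses between words.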
Key findings
FocusView was evaluated with 12 participants with ADHD (ages 19-57, 11 clinically diagnosed) who customized seven short informational videos across educational, casual, and news categories, plus four long multi-scene videos. FocusView significantly improved perceived video viewability (F = 165.4, p < 0.001, large effect size of 0.75), with a mean effectiveness rating of 6.17/7. Participants found it easy to use (mean 6.67/7) and reported low customization effort (mean 1.54/7). Preferences were highly individualized and context-dependent: Auxiliary Removal was the most popular layout option (50-66.7% across video types); 70% preferred blurring backgrounds over removing them, in order to preserve context; and 75% turned on captions for all videos, with Dynamic Caption Tracking particularly valued by four participants who applied it consistently. Audio customization showed the most consistent preferences, with 75% or more removing background music from casual videos and five participants denoising all videos. Critically, the study revealed the dual nature of video elements for viewers with ADHD: background music was perceived as a distraction by some but as a stimulation boost by others, and the customization process itself could become a new source of distraction. Participants appreciated having limited options, which reduced decision fatigue, but also wanted more granular control (e.g., selectively blurring specific background objects). For long videos, most participants (75%) preferred ad-hoc customization while watching over pre-customizing, and they identified strategies such as merging scenes with similar layouts to reduce customization workload. Concerns included processing wait time (participants preferred a maximum of 30 seconds), potential information loss from AI modifications, and ethical issues around AI-generated replacement visuals.
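The scene-merging strategy that participants suggested for long videos can be illustrated with a short sketch. This is an assumption-laden illustration, not the system's actual logic: the Scene type, the string layout label, and the rule that only consecutive scenes with an identical label are merged are all hypothetical. The summary does not say how FocusView represents scenes or judges layout similarity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Scene:
    """Hypothetical scene record: time span in seconds plus a layout label."""
    start: float
    end: float
    layout: str  # e.g. "speaker+slides", "speaker-only" (illustrative labels)


def merge_similar_scenes(scenes: List[Scene]) -> List[Scene]:
    """Merge consecutive scenes that share a layout label.

    The viewer then customizes each merged segment once instead of
    repeating the same choice for every scene.
    """
    merged: List[Scene] = []
    for s in sorted(scenes, key=lambda sc: sc.start):
        if merged and merged[-1].layout == s.layout:
            prev = merged[-1]
            # Extend the previous segment rather than starting a new one.
            merged[-1] = Scene(prev.start, max(prev.end, s.end), prev.layout)
        else:
            merged.append(Scene(s.start, s.end, s.layout))
    return merged


if __name__ == "__main__":
    video = [
        Scene(0, 40, "speaker+slides"),
        Scene(40, 95, "speaker+slides"),
        Scene(95, 120, "speaker-only"),
        Scene(120, 200, "speaker+slides"),
    ]
    for seg in merge_similar_scenes(video):
        print(seg)
```

Under these assumptions, four scenes collapse into three segments to customize. Scenes with the same layout that are separated by a different layout are kept apart, since a viewer customizing ad hoc would encounter them at different times.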
Relevance
FocusView represents an important advance in cognitive accessibility for video content, demonstrating that AI-powered customization can meaningfully improve the video watching experience for people with ADHD. The research has several practical implications for video platform designers. First, it provides evidence that video accessibility for ADHD requires different approaches than traditional accessibility: rather than adding information (captions, descriptions), it often involves removing or reducing distracting elements. Second, the finding that customization preferences are highly individual and context-dependent argues against one-size-fits-all solutions and in favor of user-controlled customization with sensible defaults. The tension between customization flexibility and cognitive load (the "customization paradox") is a crucial design insight: too many options can themselves become a distraction for users with ADHD. The design implications (ML-driven preference profiles, limited but flexible options, minimal processing time, and visual indicators for AI modifications) provide a practical roadmap for implementing ADHD-friendly video features. Limitations include the controlled lab setting, the use of pre-processed videos only, and a focus on rectangular overlay detection. Future work should address real-time processing, irregular visual elements, and longitudinal real-world deployment.
Tags: ADHD · video accessibility · video customization · distraction reduction · cognitive accessibility · computer vision · assistive technology · user preferences · informational videos