Speaker Segmentation

Also known as: Person Segmentation, Human Segmentation

The process of identifying and isolating the speaker or presenter in a video frame, separating them from the background and other visual elements. Speaker segmentation uses computer vision models to produce a pixel-level mask of the speaker in each frame. The mask supports layout customization options such as enlarging the speaker, removing the background behind them, or showing only the speaker with overlays removed. It also underpins accessibility features like Speaker Focus mode, which helps viewers with ADHD concentrate on the speaker by minimizing surrounding visual distractions.
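To illustrate how a speaker mask might be used once a model has produced it, here is a minimal Python sketch with NumPy. It assumes the mask is already available as a boolean array with the same height and width as the frame; the segmentation model itself is out of scope. The function names (remove_background, speaker_bounding_box, crop_to_speaker) are hypothetical and chosen for illustration, not taken from any particular system.

```python
import numpy as np


def remove_background(frame, mask, fill=(0, 0, 0)):
    """Keep pixels where mask is True; replace all others with a fill color.

    frame: (H, W, 3) image array. mask: (H, W) boolean array (assumed input).
    """
    fill_arr = np.array(fill, dtype=frame.dtype)
    # mask[..., None] has shape (H, W, 1) and broadcasts across the channels.
    return np.where(mask[..., None], frame, fill_arr)


def speaker_bounding_box(mask):
    """Return (top, bottom, left, right) around the mask, or None if empty.

    bottom and right are exclusive, so they can be used directly in slices.
    """
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1


def crop_to_speaker(frame, mask):
    """Crop the frame to the speaker's bounding box (a first step to enlarging)."""
    box = speaker_bounding_box(mask)
    if box is None:
        return frame  # no speaker found: leave the frame unchanged
    top, bottom, left, right = box
    return frame[top:bottom, left:right]


# Toy data: a 4x6 gray frame with a 2x3 "speaker" region.
frame = np.full((4, 6, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 2:5] = True

isolated = remove_background(frame, mask)  # background set to black
cropped = crop_to_speaker(frame, mask)     # only the speaker region remains
```

In this sketch, background removal is a per-pixel selection, while enlarging the speaker starts from the mask's bounding box; the cropped region would then be resized with whatever image library the pipeline already uses.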

Category: computer vision · video processing

Related: Object Detection · Video Customization · Speaker Focus · Background Blur

Sources

https://dl.acm.org/doi/10.1145/3663547.3746386