Diffusion Model

Also known as: Diffusion-based Generator, Denoising Diffusion Model

A diffusion model is a class of generative model that learns to produce images or videos by iteratively denoising a random noise input, reversing a forward process that gradually adds noise to training data. In accessibility work, diffusion models are used to synthesize sign language videos from pose sequences, generate image descriptions, and create visual content from text prompts. Their strength is photorealistic output. Their weakness for accessibility is that quality is usually judged with image-quality and similarity metrics (SSIM, FID, LPIPS), which do not measure whether the output conveys the correct meaning: a diffusion-rendered sign video can look smooth and natural while still failing to communicate the intended message to Deaf viewers.
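As a rough illustration of the forward (noising) process described above, the following minimal Python sketch mixes a toy signal with Gaussian noise at a chosen timestep. The linear variance schedule, the 1000-step horizon, and the function names (linear_beta_schedule, cumulative_signal, add_noise) are assumptions made for illustration in the style of common denoising diffusion formulations; they are not taken from the cited source. The learned reverse (denoising) network is not shown.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances rising linearly (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_signal(betas):
    """alpha_bar[t]: product of (1 - beta) up to step t, i.e. the
    fraction of the original signal variance that survives at step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_signal(betas)

x0 = [1.0, -1.0, 0.5, 0.0]  # toy "image" of four pixel values
x_early = add_noise(x0, 10, alpha_bars, rng)   # still close to x0
x_late = add_noise(x0, 999, alpha_bars, rng)   # almost pure noise
```

Under this assumed schedule, almost all of the original signal survives at early timesteps and almost none at the final one. A trained model learns to undo this corruption step by step, which is how generation from pure noise becomes possible.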

Category: AI accessibility · Machine Learning · AI and accessibility

Related: Sign Language Generation · Large Language Model · Machine Learning

Sources

https://doi.org/10.1145/3772318.3791429