Debiasing

Also known as: Bias mitigation, Bias correction

Debiasing refers to techniques and processes applied to AI systems, particularly machine learning models and large language models, to detect, reduce, or eliminate unfair biases that cause a system to produce outputs that discriminate against or misrepresent specific demographic groups. Debiasing methods include pre-processing approaches (modifying training data), in-processing approaches (adjusting training objectives or architectures), and post-processing approaches (filtering or adjusting model outputs). In practice, debiasing is complex: reducing negative stereotypes can inadvertently introduce new forms of harm through overcompensation, in which models produce unrealistically positive portrayals that themselves misrepresent the lived realities of marginalized groups. For people with disabilities, AI systems equipped with debiasing guardrails may generate content in the style of "inspiration porn" or suppress authentic expressions of pain and struggle, replacing them with sanitised narratives. Effective debiasing for accessibility requires community involvement, nuanced evaluation metrics, and an understanding that authentic representation must include the full spectrum of human experience.
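To make the pre-processing family concrete, the following is a minimal Python sketch of one well-known technique, instance reweighing, in which each training example is weighted by P(group) × P(label) / P(group, label) so that group membership and label become statistically independent in the weighted data. It is offered as an illustration only: the source does not prescribe this method, and the function names and toy data are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Return one weight per example: P(group) * P(label) / P(group, label).

    Under these weights, group and label are independent in the
    weighted training set. Illustrative sketch, not a library API.
    """
    if len(groups) != len(labels):
        raise ValueError("groups and labels must have the same length")
    n = len(groups)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def positive_rate(groups, labels, weights, group):
    """Weighted share of positive labels (label == 1) within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(
        w for g, y, w in zip(groups, labels, weights) if g == group and y == 1
    )
    return positive / total


# Hypothetical toy data: group "a" receives positive labels less often than "b".
groups = ["a", "a", "a", "b", "b", "b", "b", "b"]
labels = [1, 0, 0, 1, 1, 1, 1, 0]

uniform = [1.0] * len(groups)
weights = reweighing_weights(groups, labels)

# Gap in positive rates between the two groups, before and after reweighing.
gap_before = abs(
    positive_rate(groups, labels, uniform, "a")
    - positive_rate(groups, labels, uniform, "b")
)
gap_after = abs(
    positive_rate(groups, labels, weights, "a")
    - positive_rate(groups, labels, weights, "b")
)
print(f"gap before: {gap_before:.3f}, gap after: {gap_after:.3f}")
```

Closing a numeric gap like this addresses only one narrow notion of fairness. It says nothing about whether generated portrayals are authentic or overcompensated, which is why the entry calls for community involvement and more nuanced evaluation metrics.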

Category: AI bias · Machine learning · Inclusive design

Related: AI bias · Large language model · Toxic positivity · Disability representation · Overcompensation

Sources

https://doi.org/10.1145/3806202