Red-teaming

Also known as: Red team testing

A structured adversarial-testing practice in which a dedicated team deliberately attempts to break, manipulate, or provoke harmful outputs from a system — originally from military strategy, now widely used for AI systems including large language models. In an accessibility and public-service context, red-teaming is used to surface biases, failure modes on atypical speech or language, jailbreak susceptibility, and risks to vulnerable users before deployment. It complements but does not replace participatory engagement with affected disabled communities, who often identify harms that professional red-teamers miss.

Category: AI · governance · testing

Related: Model Cards · Algorithmic Decision-Making · Prompt Injection

Sources

https://doi.org/10.48550/arXiv.2202.03286
https://doi.org/10.1145/3772318.3791025