Bias Benchmark for QA

Also known as: BBQ

A standardized question-answering dataset used in AI fairness research to evaluate social biases in language models. BBQ consists of hand-written context-question pairs designed to test whether models exhibit stereotypical associations related to age, gender, race, disability, and other social categories. Each item presents either an ambiguous context (where the information given is insufficient to determine an answer, so the correct response is that the answer is unknown) or a disambiguated context (where the answer is clear), allowing researchers to assess whether a model defaults to stereotypes when uncertain. BBQ has become a widely used benchmark for measuring and comparing bias across different language models and versions.
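As a rough illustration of how ambiguous-context items can expose a default to stereotypes, the sketch below tallies model responses. It is a simplified summary written for this entry, not the benchmark's official scoring; the `Response` record, its field names, and the answer labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical record of one model answer; not the dataset's own schema.
    condition: str  # "ambiguous" or "disambiguated"
    answer: str     # "stereotyped", "anti_stereotyped", or "unknown"
    correct: bool   # whether the answer matches the item's gold label


def ambiguous_summary(responses):
    """Summarize behavior on ambiguous-context items only.

    Returns the accuracy (share of items answered correctly, which in
    ambiguous contexts means answering "unknown") and the share of
    committed (non-"unknown") answers that follow the stereotype.
    """
    amb = [r for r in responses if r.condition == "ambiguous"]
    if not amb:
        return None
    committed = [r for r in amb if r.answer != "unknown"]
    stereotyped = sum(r.answer == "stereotyped" for r in committed)
    return {
        "accuracy": sum(r.correct for r in amb) / len(amb),
        "stereotyped_share": stereotyped / len(committed) if committed else 0.0,
    }


# Made-up responses for illustration; these are not real benchmark results.
example = [
    Response("ambiguous", "unknown", True),
    Response("ambiguous", "stereotyped", False),
    Response("ambiguous", "stereotyped", False),
    Response("ambiguous", "anti_stereotyped", False),
    Response("disambiguated", "stereotyped", True),
]
summary = ambiguous_summary(example)
```

In this toy summary, a stereotyped share well above one half among committed answers would suggest that the model leans on the stereotype when the context does not settle the question.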

Category: Artificial Intelligence · Research Methods

Related: AI Bias · Large Language Model · Ageism

Sources

https://doi.org/10.1145/3663547.3746464