Constitutional AI

Also known as: CAI

A training method introduced by Anthropic in 2022 in which a large language model is aligned to a written set of principles (a 'constitution') through self-critique and reinforcement learning from AI feedback (RLAIF), rather than relying exclusively on human preference labels. Training has two phases. In the supervised phase, the model generates a response, critiques it against a principle from the constitution, revises it, and is fine-tuned on the revised responses. In the reinforcement learning phase, a model compares pairs of responses against the constitution, and those AI-generated preferences train a preference model that supplies the reward signal. Constitutional AI is relevant to accessibility because the principles can explicitly include accessibility rules, for example 'always produce descriptive alt text' or 'never use vague link phrases like "read more"', making accessibility a first-class target of model alignment rather than an after-the-fact test.
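The generate, critique, revise loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Anthropic's implementation: the names `CONSTITUTION`, `critique_and_revise`, `toy_critique`, and `toy_revise` are hypothetical, and the two toy functions are regex-based stand-ins for what would really be calls to the language model itself. The reinforcement learning step is not shown.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical constitution, built from the two accessibility examples in this entry.
CONSTITUTION: List[str] = [
    "Always produce descriptive alt text.",
    "Never use vague link phrases like 'read more'.",
]

def critique_and_revise(
    prompt: str,
    draft: str,
    constitution: List[str],
    critique: Callable[[str, str], str],
    revise: Callable[[str, str, str], str],
) -> Tuple[str, str]:
    """Check the draft against each principle in turn and revise when a problem is found.

    Returns a (prompt, revised_response) pair of the kind that could be
    collected into a fine-tuning dataset.
    """
    response = draft
    for principle in constitution:
        problem = critique(response, principle)
        if problem:  # an empty string means the principle is satisfied
            response = revise(response, principle, problem)
    return prompt, response

def toy_critique(response: str, principle: str) -> str:
    """Stand-in for a model-written critique (assumption: a real system asks the model)."""
    if "alt text" in principle and re.search(r"<img(?![^>]*\balt=)[^>]*>", response):
        return "An <img> element has no alt attribute."
    if "link phrases" in principle and re.search(r">\s*read more\s*</a>", response, re.I):
        return "A link uses the vague text 'read more'."
    return ""

def toy_revise(response: str, principle: str, problem: str) -> str:
    """Stand-in for a model-written revision; the replacement texts are hard-coded."""
    if "alt attribute" in problem:
        return re.sub(r"<img(?![^>]*\balt=)",
                      '<img alt="Bar chart of quarterly sales"', response)
    if "read more" in problem:
        return re.sub(r">\s*read more\s*</a>",
                      ">Read the quarterly sales report</a>", response, flags=re.I)
    return response

draft = '<img src="sales.png"> <a href="/report">read more</a>'
prompt_out, revised = critique_and_revise(
    "Summarise the sales report as HTML.", draft, CONSTITUTION, toy_critique, toy_revise
)
print(revised)
```

Passing the critique and revision steps in as functions keeps the loop independent of any particular model API; in a real pipeline both would be prompts sent to the model being trained, and the resulting pairs would feed supervised fine-tuning.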

Category: Artificial Intelligence · Machine Learning · AI ethics

Related: Large Language Model · Reinforcement Learning from Human Feedback

Sources

https://arxiv.org/abs/2212.08073