Inter-Annotator Agreement

Also known as: IAA, Inter-rater agreement, Inter-coder agreement

A statistical measure of how consistently two or more human annotators assign the same label to the same data item. It is widely used in NLP, computer vision, and AI dataset construction as a proxy for label quality. Common measures include Cohen's kappa (two annotators), Fleiss' kappa (a fixed number of annotators per item, typically more than two), and Krippendorff's alpha (any number of annotators, tolerating missing labels), all of which correct raw agreement for the agreement expected by chance. Low inter-annotator agreement can indicate ambiguous data, unclear guidelines, or genuine human disagreement rooted in different lived perspectives. In accessibility datasets it is often a signal that annotators lack the embodied knowledge needed to interpret the data, rather than a defect to be averaged away through majority voting.
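
Example: as an illustration of the measures named above, the following is a minimal sketch of Cohen's kappa for two annotators, computed as (observed agreement - chance agreement) / (1 - chance agreement). The function name `cohens_kappa` and the sample labels are hypothetical and chosen only for illustration; they do not come from the cited source.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both annotators must label the same non-empty set of items.")

    n = len(labels_a)

    # Observed agreement: fraction of items on which the two labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both annotators pick the same label
    # if each labels independently according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is undefined (0/0).
        return float("nan")

    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from two annotators on ten items (illustrative data only).
annotator_a = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no", "no"]
annotator_b = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(round(kappa, 2))  # raw agreement is 0.8, chance agreement is 0.5, so kappa is 0.6
```

A single kappa value summarizes agreement but does not say why annotators disagree; inspecting the individual items on which labels differ is usually needed to tell ambiguous data or unclear guidelines apart from genuine differences in perspective.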

Category: Research Methods · Datasets · Statistics · AI ethics

Related: Data Annotation · Ground Truth · Disability-First Dataset

Sources

https://doi.org/10.1145/3772318.3790405