Gold-Standard Evaluation

Also known as: Gold Standard, Reference Standard Evaluation

An evaluation methodology in natural language processing and generation in which system output is compared against a set of pre-established correct or ideal responses. In text-based systems, the gold standard is typically a set of human-produced reference strings that serve as the benchmark. The approach is problematic for sign language generation for three reasons: sign languages lack a standard written form, parallel corpora of English-ASL text do not exist, and actual users of sign language generation systems would view animations rather than read written text. For sign language technology, this limitation calls for user-based evaluation, in which native signers assess the quality of the animations directly.
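For text-based systems, the comparison against reference outputs can be sketched roughly as follows. This is a minimal illustration and not a method taken from the cited source: the function names `token_f1` and `evaluate`, the two metrics (exact match and token-overlap F1), and the sample sentences are all assumptions chosen for the example.

```python
from collections import Counter


def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between one system output and one reference string."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        # Two empty strings count as a match; one empty string does not.
        return float(cand == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(outputs: list[str], gold: list[list[str]]) -> dict[str, float]:
    """Score each output against its best-matching gold-standard reference.

    gold[i] holds one or more human-produced references for outputs[i]
    (hypothetical data layout, assumed for this sketch).
    """
    if not outputs or len(outputs) != len(gold):
        raise ValueError("outputs and gold must be non-empty and equal in length")
    exact = 0
    f1_total = 0.0
    for out, refs in zip(outputs, gold):
        exact += any(out.strip().lower() == r.strip().lower() for r in refs)
        f1_total += max(token_f1(out, r) for r in refs)
    n = len(outputs)
    return {"exact_match": exact / n, "mean_token_f1": f1_total / n}


# Illustrative data only.
system_outputs = ["the cat sat on the mat", "a dog barked"]
gold_standard = [
    ["the cat sat on the mat"],
    ["the dog barked loudly", "a dog was barking"],
]
scores = evaluate(system_outputs, gold_standard)
print(scores)
```

Every step in this sketch depends on both the output and the references being written strings, which is exactly the precondition that sign language generation does not meet.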

Category: Evaluation Methods · Research Methods · Natural Language Processing

Related: Natural Language Generation · Sign Language Generation

Sources

https://doi.org/10.1145/1296843.1296879