Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective
Danielle Bragg, Oscar Koller, Mary Bellard, Larwan Berke, Patrick Boudreault, Annelies Braffort, Naomi Caselli, Matt Huenerfauth, Hernisa Kacorri, Tessa Verhoef, Christian Vogler, Meredith Ringel Morris · 2019 · Proceedings of the 21st International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2019) · doi:10.1145/3308561.3353774
Summary
This paper provides a comprehensive interdisciplinary overview of sign language processing (recognition, generation, and translation), produced through a two-day workshop that brought together 39 experts from computer science, linguistics, Deaf studies, and industry. The authors argue that existing research in sign language technology occurs in disciplinary silos: computer vision researchers build recognition systems without understanding Deaf culture or sign language linguistics, and linguists work without awareness of computational constraints. The workshop addressed three questions: what an interdisciplinary view of the current landscape reveals, what the biggest challenges are, and what calls to action the community should pursue. The paper provides essential background often overlooked by technologists, including the cultural significance of sign languages to Deaf communities, the history of audism and language suppression, and the linguistic complexity of sign languages, which are not manual versions of spoken languages but distinct natural languages with their own phonology, grammar, and regional variation. Worldwide, there are over 300 sign languages, used by 70 million deaf people. The paper then reviews five technical domains (datasets, recognition and computer vision, modeling and NLP, avatars and computer graphics, and UI/UX design), synthesizing the state of each field and identifying where they intersect.
Key findings
The paper identifies several critical challenges. Sign language datasets are orders of magnitude smaller than speech corpora: the largest contain fewer than 100,000 articulated signs, compared to billions of words in speech datasets. Most datasets lack diversity in signers, use non-native signers, and focus predominantly on ASL, leaving hundreds of other sign languages unrepresented. Recognition systems still struggle with continuous signing: the reported word error rate is 22.9% even on limited vocabularies and when tested on the same signers used for training. Sign language avatar generation remains partially manual, with no fully automated pipeline for producing natural-looking signing. The uncanny valley effect is a particular challenge, because avatars must convey meaningful facial expressions (raised eyebrows for questions, specific mouth movements) while also appearing natural. A key structural insight is that sign languages differ fundamentally from spoken languages in simultaneity (conveying multiple streams of information at once), spatial organization, and the use of depiction, so NLP methods designed for spoken or written language are often ineffective when applied directly. The paper culminates in five calls to action: involve Deaf team members throughout, focus on real-world applications, develop UI guidelines for sign language systems, create larger representative public datasets, and standardize annotation systems.
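For readers unfamiliar with the word error rate (WER) metric cited above, the following is a minimal sketch of the standard definition: the number of substitutions, deletions, and insertions needed to turn the recognized sequence into the reference, divided by the reference length. The function name and the gloss sequences are hypothetical illustrations, not data or code from the paper, and this summary does not describe how the 22.9% figure itself was computed.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Standard WER: (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one token")

    # prev[j] is the edit distance between the reference prefix handled
    # so far and the first j tokens of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i] + [0] * len(hypothesis)
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical gloss sequences, for illustration only.
reference = "TOMORROW RAIN NORTH WIND STRONG".split()
hypothesis = "TOMORROW RAIN SOUTH WIND".split()
print(f"{word_error_rate(reference, hypothesis):.1%}")  # prints 40.0%
```

Here one gloss is substituted and one is dropped, giving 2 errors over 5 reference glosses. Note that WER can exceed 100% when the hypothesis contains many insertions, so it is an error ratio rather than a true percentage of words wrong.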
Relevance
This paper is essential reading for anyone working at the intersection of AI and accessibility, particularly those developing communication technologies for Deaf and hard of hearing users. Its central message — that sign language technology cannot succeed without deep Deaf community involvement and interdisciplinary collaboration — challenges the common approach of treating sign language recognition as a pure computer vision problem. The calls to action remain highly relevant: most sign language AI projects still lack meaningful Deaf participation, datasets remain small and unrepresentative, and there is still no standard annotation system. For accessibility practitioners, the paper provides crucial context about why "sign language gloves" and similar technologies have repeatedly failed to gain community adoption — they are often built by hearing teams without understanding what Deaf users actually need. The emphasis on respecting Deaf ownership of sign languages and avoiding cultural appropriation in dataset creation sets an important ethical framework for future work.
Tags: sign language · Deaf culture · computer vision · natural language processing · machine translation · sign language recognition · sign language generation · avatars · ASL · interdisciplinary research · dataset challenges