American Sign Language natural language generation and machine translation

Matt Huenerfauth · 2005 · SIGACCESS Accessibility and Computing · doi:10.1145/1055674.1055676

Summary

Matt Huenerfauth's 2005 paper describes a research programme to build an English-to-American Sign Language (ASL) machine translation (MT) system that generates animations of a 3D virtual-reality signing character. The author frames the project against a stark literacy gap: most deaf U.S. high school graduates read English at roughly a fourth-grade level, so accessibility aids that assume strong English literacy, such as closed captioning and teletype (TTY) telephones, often fail the users they are meant to serve. Because many deaf people with limited English are fluent in ASL, an automated English-to-ASL translator could provide meaningful access where captions are too complex or human interpreters are unavailable. The paper surveys the linguistic obstacles that distinguish English-to-ASL translation from translation between spoken languages, focusing on 'classifier predicates': productive hand movements that trace spatial paths, contours, or locations in the three-dimensional space around the signer. Classifier predicates let a signer render 'the car drove up the hill' as a single hand tracing a 3D path, something no prior English-to-ASL MT system could generate because existing text-based MT architectures do not model 3D spatial arrangement. Huenerfauth's proposed architecture addresses this by coupling 'natural language control' virtual reality software, which turns English descriptions into 3D scene movements, with a multi-path translation pipeline that routes sentences requiring classifier predicates through the VR model and sends simpler sentences through conventional MT.
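
The routing idea behind the multi-path pipeline can be illustrated with a minimal Python sketch. This is not the paper's implementation: the keyword test for spatial sentences, the cue-word list, and every function and class name below are assumptions made purely for illustration, and both pathways are stubs that only label their input.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical cue words. The paper does not say how spatial sentences are
# detected; a real system would need far richer linguistic analysis.
SPATIAL_CUES = {"up", "down", "toward", "between", "behind", "beside", "across", "around"}


@dataclass
class Translation:
    pathway: str  # which pathway handled the sentence
    output: str   # placeholder for the ASL output the pathway would produce


def needs_classifier_predicate(sentence: str) -> bool:
    """Crude stand-in: flag a sentence as spatial if it contains a cue word."""
    words = {w.strip(".,;!?").lower() for w in sentence.split()}
    return bool(words & SPATIAL_CUES)


def classifier_pathway(sentence: str) -> str:
    """Stub for the VR-based pathway that would build a 3D scene model."""
    return f"3D-SCENE[{sentence}]"


def conventional_pathway(sentence: str) -> str:
    """Stub for an existing text-based English-to-ASL MT system."""
    return f"GLOSS[{sentence}]"


def translate(
    sentence: str,
    spatial: Callable[[str], str] = classifier_pathway,
    fallback: Callable[[str], str] = conventional_pathway,
) -> Translation:
    """Route spatial sentences to the classifier pathway, all others to the fallback."""
    if needs_classifier_predicate(sentence):
        return Translation("classifier", spatial(sentence))
    return Translation("conventional", fallback(sentence))


if __name__ == "__main__":
    for s in ["The car drove up the hill.", "I like coffee."]:
        print(translate(s))
```

Passing the two pathways in as parameters mirrors the design goal described in the summary: the classifier-predicate generator sits beside an existing MT system, which can be swapped without changing the routing logic.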

Key findings

The paper's principal contribution is an architectural proposal rather than empirical results — this is a dissertation-stage progress report, not an evaluation study. Huenerfauth argues that the English/ASL sentence pairs most valuable to deaf users with limited English literacy are precisely the pairs that prior MT systems cannot handle, because they involve spatial descriptions requiring classifier predicates. Sign-frequency studies cited in the paper indicate that ASL signers produce between one and seventeen classifier predicates per minute depending on genre, so ignoring them yields output of very limited fluency. The author reports three concrete accomplishments: (1) a survey and critique of previous English-to-ASL MT systems showing that none model 3D spatial arrangement; (2) a comparison of competing linguistic theories of classifier predicates, selecting a model that accounts for human-produced data while minimising implementation overhead; and (3) a multi-path translation architecture designed so that the classifier-predicate generator can be bolted on top of existing English-to-ASL MT systems. Future work planned for spring 2005 includes initial implementation and evaluation by native ASL signers from the American Deaf community, who will guide refinements and help select an initial application domain such as educational software for deaf students or translation of spatially descriptive text.

Relevance

This early paper is foundational reading for anyone working at the intersection of sign language technology and accessibility, and it remains relevant two decades later because the core problem (generating linguistically rich ASL rather than word-for-word signed English glosses) is still unsolved in commercial signing-avatar products. For practitioners, the most useful takeaway is the literacy-gap framing: English captioning is not a universal solution for deaf users, and accessibility teams deploying caption-based or text-based aids should understand that a substantial portion of their deaf audience may not benefit from English-centric approaches. The paper also usefully distinguishes signing-avatar work aimed at accessibility from novelty animation projects by insisting on linguistic fidelity as the correctness criterion. The paper's limitations are its small scope (four pages, no implementation, no user evaluation) and its age; readers should pair it with more recent work on neural sign-language translation, signing-avatar acceptability research, and Deaf community critiques of avatar technology.

Tags: ASL · American Sign Language · deaf accessibility · sign language · sign language animation · sign language generation · sign language machine translation · machine translation · natural language generation · signing avatar · animation · computational linguistics · virtual reality · English literacy · classifier predicates