The Development of Language Processing Support for the ViSiCAST Project

R. Elliott, J. R. W. Glauert, J. R. Kennaway, I. Marshall · 2000 · Proceedings of the Fourth International ACM Conference on Assistive Technologies (Assets '00) · doi:10.1145/354324.354349

Summary

This paper from the University of East Anglia describes early work on the ViSiCAST project (Virtual Signing, Animation, Capture, Storage and Transmission), a three-year EU-funded initiative to improve access to services for Deaf citizens through sign language presented by a virtual human avatar. The project builds on earlier UK projects that developed signing avatars named Simon, Tessa, and Visia using motion capture technology: cybergloves with 18 resistive elements per hand, Polhemus magnetic sensors for wrist, arm, head and torso tracking, and a helmet-mounted camera with infrared LEDs for facial expression capture. The paper focuses on two language processing challenges: developing an XML-compliant notation for sign language gestures (Signing Gesture Markup Language, or SiGML) that can drive the avatar, and developing a framework for translating English text into sign language via this notation. A key insight is that natural sign languages such as British Sign Language (BSL) have their own morphology, phonology, and syntax, fundamentally different from those of spoken languages; they are inherently multimodal, exploiting hand shape, position, orientation, movement, facial expression, and body posture simultaneously. The translation approach uses Discourse Representation Structures (DRSs) as an intermediate representation between English text and sign language output, allowing the same DRS to be converted to multiple national sign languages (BSL, German Sign Language/DGS, Sign Language of the Netherlands/NGT). The paper also describes the TESSA system, developed with the UK Post Office, which translates clerk speech to BSL signing via avatar for counter transactions.
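The role of the DRS as a shared intermediate representation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `DRS` and `Condition` classes, the `to_glosses` function, and the placeholder lexicons with their invented gloss labels. It models no real sign language grammar; it only shows one DRS feeding more than one target-language generator.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Condition:
    """One predicate over discourse referents, e.g. weigh(x, y)."""
    predicate: str
    args: tuple


@dataclass
class DRS:
    """A toy Discourse Representation Structure: referents plus conditions."""
    referents: list = field(default_factory=list)
    conditions: list = field(default_factory=list)


def to_glosses(drs, lexicon, language):
    """Map each DRS condition to a gloss label for one target language.

    Hypothetical generator: it looks up predicates in a placeholder lexicon
    and keeps the DRS condition order. It models no real grammar.
    """
    glosses = []
    for cond in drs.conditions:
        # Every argument must be a referent declared in the DRS.
        unknown = [a for a in cond.args if a not in drs.referents]
        if unknown:
            raise ValueError(f"undeclared referents: {unknown}")
        glosses.append(f"{language}:{lexicon[cond.predicate]}")
    return glosses


# Toy DRS for the English sentence "A clerk weighs a parcel."
drs = DRS(
    referents=["x", "y"],
    conditions=[
        Condition("clerk", ("x",)),
        Condition("parcel", ("y",)),
        Condition("weigh", ("x", "y")),
    ],
)

# Placeholder lexicons: the labels are invented, not real sign glosses.
LEXICONS = {
    "BSL": {"clerk": "CLERK", "parcel": "PARCEL", "weigh": "WEIGH"},
    "DGS": {"clerk": "ANGESTELLTE", "parcel": "PAKET", "weigh": "WIEGEN"},
}

# The same DRS is handed to each target-language generator.
outputs = {lang: to_glosses(drs, lex, lang) for lang, lex in LEXICONS.items()}
for lang, glosses in outputs.items():
    print(lang, glosses)
```

The design point is that the English analysis is done once, when the DRS is built; adding a further target language in this sketch means adding a lexicon (and, in a real system, a grammar), not re-analysing the English input.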

Key findings

The project established that converting English text to Sign Supported English (SSE), which follows English word order, was technically feasible but not what pre-lingually Deaf users actually wanted; they preferred natural sign languages such as BSL, with their own grammar and syntax. This was a pivotal finding that redirected the project from SSE to BSL translation. The SiGML notation was designed as an XML representation building on HamNoSys (Hamburg Notation System), a well-established phonetic notation for sign languages, extending it to be machine-processable and capable of driving avatar animation. The notation operates at multiple levels: glossing (English word or phrase equivalents), phonology, phonetics, and physical articulation including motion capture data. The TESSA Post Office system was evaluated by six pre-lingually Deaf people and three post office clerks, whose feedback was encouraging and constructive and who identified facial expression and handshape clarity as critical for comprehension. The translation framework identified several BSL-specific challenges: classifier handshapes that incorporate pronominal references into verb signs, temporal phenomena that differ fundamentally from English tense systems, and grammatical distinctions (such as one-to-many vs. one-to-one relationships) that are ambiguous in English but must be resolved for BSL. The authors acknowledged that high-quality translation could not be fully automatic and would require human intervention to resolve ambiguities.
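The idea of an XML notation that carries several levels of description for each sign can be sketched as follows. This is not the SiGML schema: the element and attribute names (`utterance_sketch`, `sign`, `gloss`, `manual`, `nonmanual`) and all attribute values are hypothetical placeholders chosen for illustration, and only the gloss level and a phonology-like level are shown. The sketch uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def build_sign(gloss, handshape, location, orientation, movement, face=None):
    """Build one hypothetical <sign> element with two description levels."""
    sign = ET.Element("sign")
    # Gloss level: an English word or phrase equivalent.
    ET.SubElement(sign, "gloss").text = gloss
    # Phonology-like level: manual features as placeholder attribute values.
    ET.SubElement(sign, "manual", {
        "handshape": handshape,
        "location": location,
        "orientation": orientation,
        "movement": movement,
    })
    # Non-manual features such as facial expression are optional here.
    if face is not None:
        ET.SubElement(sign, "nonmanual", {"face": face})
    return sign


def build_utterance(signs):
    """Wrap a sequence of <sign> elements in a hypothetical root element."""
    root = ET.Element("utterance_sketch")
    root.extend(signs)
    return root


doc = build_utterance([
    build_sign("PARCEL", handshape="flat", location="chest",
               orientation="palm_down", movement="outline", face="neutral"),
    build_sign("WEIGH", handshape="flat", location="neutral_space",
               orientation="palm_up", movement="alternate_up_down"),
])

# Serialise to text, then parse it back as a consumer (e.g. an avatar
# driver) might, and read off the gloss level.
xml_text = ET.tostring(doc, encoding="unicode")
parsed = ET.fromstring(xml_text)
glosses = [s.findtext("gloss") for s in parsed.iter("sign")]
print(xml_text)
print(glosses)
```

Keeping the levels as separate child elements means a consumer that only needs glosses can ignore the articulation detail, while an animation component can read the manual and non-manual features directly.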

Relevance

The ViSiCAST project was a landmark initiative in sign language technology that laid groundwork still relevant to modern avatar-based signing systems. The SiGML notation influenced subsequent sign language representation standards and avatar technologies. The finding that Deaf users prefer natural sign language over signed versions of spoken languages remains a critical principle for any sign language technology — systems that simply sign English word-by-word fail to serve pre-lingually Deaf users whose primary language has entirely different grammar. This lesson applies directly to modern AI-based sign language generation efforts. The DRS-based translation architecture, with its intermediate representation supporting multiple target sign languages, anticipates multilingual sign language translation approaches. For accessibility practitioners, the TESSA system demonstrates both the potential and limitations of avatar-based signing in service delivery contexts — useful for routine transactions with limited vocabulary, but requiring careful evaluation by Deaf users. The project also highlights that sign language accessibility is fundamentally a translation problem, not merely a representation problem, requiring the same sophistication as any cross-linguistic machine translation system.

Tags: sign language · sign language avatar · virtual signing · British Sign Language · machine translation · natural language processing · motion capture · HamNoSys · deaf accessibility · XML

Standards referenced: MPEG-4 · XML · VRML