Pattern recognition and synthesis for sign language translation system

M. Ohki, H. Sagawa, T. Sakiyama, E. Oohira, H. Ikeda, H. Fujisawa · 1994 · Proceedings of the First Annual ACM Conference on Assistive Technologies (Assets '94) · doi:10.1145/191028.191030

Summary

This paper describes a system under development for bidirectional translation between Japanese Sign Language (JSL) and written Japanese. In one direction, the system recognizes sign language input and converts it to Japanese text; in the other, it converts Japanese text into sign language rendered as 3D computer graphic animation. For recognition, the user wears a DataGlove, an input device that captures hand shape and position data. Pattern recognition algorithms process the captured hand motion data to identify individual signs, and the resulting sign sequences are translated into Japanese sentences. For synthesis, Japanese text is analyzed and translated into a sequence of signs, which a virtual signer performs as 3D computer graphic animation. The research is an early and ambitious attempt at fully automated, bidirectional sign language translation, made at a time when real-time gesture recognition and 3D character animation were still immature.
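As an illustration only, the recognition direction can be sketched as nearest-template matching over glove readings. This is not the authors' algorithm, which this summary does not specify; the five-value finger-bend frames, the template values, the sign labels, and the distance threshold are all invented for the example.

```python
import math

# Hypothetical templates: each sign label maps to five finger-bend values
# in [0, 1] (thumb to little finger). All labels and values are invented.
TEMPLATES = {
    "SIGN_A": [0.9, 0.1, 0.1, 0.9, 0.9],
    "SIGN_B": [0.1, 0.1, 0.1, 0.1, 0.1],
    "SIGN_C": [0.9, 0.9, 0.9, 0.9, 0.9],
}


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(frame, templates=TEMPLATES, max_distance=0.5):
    """Return the label of the closest template, or None if none is close enough."""
    label, best = min(
        ((name, distance(frame, vec)) for name, vec in templates.items()),
        key=lambda pair: pair[1],
    )
    return label if best <= max_distance else None


def recognize(frames):
    """Classify each frame, skipping unmatched frames and consecutive repeats
    of the most recently accepted label."""
    labels = []
    for frame in frames:
        label = classify(frame)
        if label is not None and (not labels or labels[-1] != label):
            labels.append(label)
    return labels
```

A real recognizer would also need hand position and movement over time, and a way to segment continuous signing; this sketch treats every frame as an independent static handshape.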

Key findings

The system demonstrates that bidirectional translation between a signed language and a written language was feasible with 1994-era technology. The DataGlove captures enough hand shape and position information for the pattern recognition component to identify signs, though vocabulary size and recognition accuracy were likely constrained by the hardware and algorithms of the time. The 3D computer graphic animation output is an early approach to sign language synthesis, which became a significant research area in subsequent decades. The bidirectional design is notable because most contemporary systems attempted only one direction of translation. The work also highlights a fundamental challenge: sign language is not a gestural encoding of spoken language but has its own grammar and syntax, so translation between JSL and Japanese requires linguistic processing beyond word-for-word substitution.

Relevance

This paper is an early entry in what has become one of the most active research areas in accessibility technology: automated sign language recognition and generation. The challenges identified here remain central research problems three decades later: capturing the full range of manual and non-manual features of sign language, performing real-time pattern recognition on continuous signing, and generating natural-looking sign language animation. Modern approaches have largely replaced the DataGlove with computer vision and depth cameras, and 3D avatar animation has advanced dramatically, but the overall architecture of input capture, pattern recognition, linguistic processing, and animated output remains remarkably similar. The focus on Japanese Sign Language is significant because it reflects the global nature of sign language accessibility research and the fact that each national sign language requires its own recognition and synthesis models.
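The four-stage architecture named above can be sketched as a chain of functions, each consuming the previous stage's output. Every stage body here is a toy placeholder chosen for the example (five-value frames, a mean-bend rule, the labels OPEN and FIST, text in place of animation); none of it comes from the paper.

```python
from functools import reduce


def capture(raw_samples):
    """Input capture: keep only complete samples (hypothetical 5-value frames)."""
    return [s for s in raw_samples if len(s) == 5]


def recognize_stage(frames):
    """Pattern recognition: toy rule that labels a frame by its mean finger bend."""
    return ["OPEN" if sum(f) / len(f) < 0.5 else "FIST" for f in frames]


def linguistic_stage(glosses):
    """Linguistic processing: placeholder that only collapses consecutive repeats."""
    out = []
    for gloss in glosses:
        if not out or out[-1] != gloss:
            out.append(gloss)
    return out


def render(glosses):
    """Output: a plain-text stand-in for animation or translated text."""
    return " ".join(glosses)


STAGES = [capture, recognize_stage, linguistic_stage, render]


def run(raw_samples):
    """Feed the input through each stage in order."""
    return reduce(lambda data, stage: stage(data), STAGES, raw_samples)
```

Keeping the stages separate is what allows one of them to be swapped, for example a glove reader for a camera-based tracker, without touching the others.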

Tags: sign language recognition · sign language synthesis · Japanese Sign Language · gesture recognition · DataGlove · 3D animation · computer graphics · machine translation · deaf and hard of hearing · pattern recognition · bidirectional translation