A Web Based Multi-Linguists Symbol-to-Text AAC Application

Chaohai Ding, Nawar Halabi, Lama Al-Zaben, Yunjia Li, E. A. Draffan, Mike Wald · 2015 · Proceedings of the 12th International Web for All Conference (W4A) · doi:10.1145/2745555.2746674

Summary

This paper presents Symbol Dragoman, a web-based augmentative and alternative communication (AAC) application that enables users who have no spoken language to communicate in both Arabic and English using pictographic symbols. The core problem addressed is that existing AAC symbol sets are typically tied to a single language — each symbol maps to a word or phrase in one language, making multilingual communication extremely difficult. Symbol Dragoman allows users to select pictographic symbols (drawn from the ARASAAC symbol dictionary) in any order to construct a message, then generates grammatically well-formed sentences in both English and Arabic that can be read on screen or spoken aloud via text-to-speech. The application handles right-to-left layout for Arabic and left-to-right for English, and uses a responsive design for mobile and tablet browsers. The system builds on the Arabic Symbol Dictionary project at the University of Southampton, addressing the specific challenges of AAC across languages with fundamentally different morphological structures.

Key findings

The paper describes a different sentence-generation approach for each language, driven by the significant linguistic differences between English and Arabic. For English, the system runs a keyword search against a large sentence corpus indexed in Elasticsearch: each symbol maps to one or more keywords from the ARASAAC dictionary, and the system retrieves sentences containing those keywords regardless of the order in which the symbols were entered. This corpus-based approach works well for English because millions of sentences are available. For Arabic, the same approach failed because the available parallel Arabic corpus was too small (roughly 9,000 sentences). Instead, the researchers developed a morphological approach: they manually generated verb conjugations for all subject pronouns (first/second/third person, masculine/feminine, singular/plural) and four adjectival morphs per keyword, and left nouns unaffixed for the demo. Arabic is particularly complex because pronouns are typically realised as word affixes rather than separate words, so a phrase like "my apple" is a single affixed Arabic word rather than two separate words. The paper also reviews prior approaches to symbol-to-text translation: predictive systems for Blissymbolic-to-English (which assume ordered input following English syntax), non-linear construction areas arranged by semantic role, and SymbolPath's semantic frames for predicting intended symbols from a continuous motion path.
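The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not the authors' implementation: a plain in-memory inverted index stands in for Elasticsearch, the English sentences are invented for illustration, and the Arabic table holds a handful of hand-written present-tense forms of one verb. All names (`CORPUS`, `build_index`, `find_sentences`, `CONJUGATIONS`, `NOUNS`, `arabic_sentence`) are hypothetical.

```python
from collections import defaultdict

# Toy English corpus; these sentences are invented for illustration only.
CORPUS = [
    "I want to eat an apple",
    "The boy wants to drink water",
    "She likes to eat bread",
]


def build_index(sentences):
    """Map each lowercase token to the ids of the sentences containing it."""
    index = defaultdict(set)
    for sid, sentence in enumerate(sentences):
        for token in sentence.lower().split():
            index[token].add(sid)
    return index


def find_sentences(keywords, sentences, index):
    """Return sentences containing every keyword, ignoring keyword order."""
    if not keywords:
        return []
    ids = set.intersection(*(index.get(k.lower(), set()) for k in keywords))
    return [sentences[i] for i in sorted(ids)]


# Hand-written present-tense forms, keyed by verb keyword and subject pronoun.
# Assumption: a real table would cover every verb keyword and every pronoun.
CONJUGATIONS = {
    "eat": {
        "I": "آكل",
        "you_m": "تأكل",
        "you_f": "تأكلين",
        "he": "يأكل",
        "she": "تأكل",
        "we": "نأكل",
    }
}

# Nouns are stored unaffixed, mirroring the simplification made for the demo.
NOUNS = {"apple": "تفاحة"}


def arabic_sentence(symbols):
    """Compose a verb + noun phrase from symbols supplied in any order."""
    verb = next(s for s in symbols if s in CONJUGATIONS)
    noun = next(s for s in symbols if s in NOUNS)
    pronoun = next(s for s in symbols if s in CONJUGATIONS[verb])
    # The subject is carried by the verb form, so no separate pronoun is emitted.
    return f"{CONJUGATIONS[verb][pronoun]} {NOUNS[noun]}"


if __name__ == "__main__":
    index = build_index(CORPUS)
    print(find_sentences(["apple", "eat"], CORPUS, index))
    print(arabic_sentence(["apple", "I", "eat"]))
```

In this sketch the English side can only return sentences that already exist in the corpus, which is why corpus size matters, while the Arabic side can only produce forms that someone has entered into the table by hand.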

Relevance

This paper highlights an important and underserved area: multilingual AAC. Most AAC tools and research are English-centric, yet AAC users exist in every language community. The Arabic focus is particularly valuable given the relative scarcity of Arabic-language AAC resources and the significant morphological challenges Arabic poses for symbol-to-text conversion (pronominal affixing, right-to-left script, complex verb conjugation). For practitioners, the key insight is that a symbol-to-text method cannot simply be ported from one language to another: fundamentally different linguistic architectures require fundamentally different computational approaches. The web-based, responsive design makes the tool accessible without requiring specialised hardware or app installation. However, this is a short demonstration paper with acknowledged limitations: sentence generation accuracy needs improvement, symbol input speed is limited, and semantic relations between symbols are not yet modelled. The morphological handling for Arabic nouns was left incomplete, and verb conjugations were generated manually rather than automatically. Future work involving personalisation and machine learning could address these gaps.

Tags: AAC · augmentative and alternative communication · symbol communication · multilingual accessibility · Arabic · natural language generation · pictograms · web accessibility · text-to-speech