Direct Translation between Sign Languages

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets direct cross-lingual communication for the 1.5 billion deaf and hard-of-hearing people worldwide, which existing sign language translation systems hinder by routing through spoken or written intermediaries. The paper proposes an end-to-end sign-to-sign translation framework that operates without intermediate text. It uses back-translation to synthesize parallel sign-sign pairs from unaligned per-language corpora and trains a single mBART-based model jointly on text-to-sign and sign-to-sign translation. Evaluation combines a geometric metric (DTW-aligned MPJPE) with a linguistic one (BLEU-4 after predicted signs are translated back to text). Experiments across American Sign Language (ASL), Chinese Sign Language (CSL), and German Sign Language (DGS) show improvements over the cascaded baseline: 20% lower geometric error, 50% higher back-translation BLEU-4, and a roughly 2.3× speedup. The synthetic pairs address the lack of parallel data, and skipping text avoids discarding information carried only in the visual modality.
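The summary does not spell out how the synthetic pairs are built, so the following is only one plausible reading, sketched under stated assumptions: each per-language corpus holds (text, sign) pairs, the text is machine-translated into the target language, and a text-to-sign model generates the target-side signing. The function names `translate_text` and `text_to_sign`, and the representation of a sign utterance as a list, are hypothetical placeholders and not the paper's API.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation: a sign utterance is a list of frames.
SignSeq = List[object]


def synthesize_sign_pairs(
    corpus: Iterable[Tuple[str, SignSeq]],
    translate_text: Callable[[str], str],
    text_to_sign: Callable[[str], SignSeq],
) -> List[Tuple[SignSeq, SignSeq]]:
    """Build synthetic (source sign, target sign) pairs from one corpus.

    Assumed pipeline (not confirmed by the source):
      1. take a real (text, sign) pair in the source language,
      2. translate the text into the target language,
      3. generate target-language signing from the translated text.
    """
    pairs = []
    for text, source_sign in corpus:
        target_text = translate_text(text)        # text-to-text step
        target_sign = text_to_sign(target_text)   # text-to-sign step
        pairs.append((source_sign, target_sign))
    return pairs
```

In this reading, the source side of every pair is real signing and only the target side is synthetic, and the intermediate text is used at data-construction time only, not at inference.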

📝 Abstract

Sign language translation has made significant progress on translation between signed and spoken languages, but translation between sign languages remains largely unexplored. Direct sign-to-sign translation could help 1.5 billion deaf and hard-of-hearing (DHH) people worldwide communicate across language barriers without relying on hearing interpreters or written-language fluency. The cascaded approach, which composes separate sign-to-text, text-to-text, and text-to-sign systems, suffers from error propagation and added latency, and it loses information unique to the visual modality. We aim to develop direct sign-to-sign translation. However, no large-scale open-domain parallel corpus between sign languages has been curated. To enable direct translation between sign language utterances, we use back-translation to produce synthetic sign-sign pairs from unaligned, per-language utterance-sign corpora. Using this data, we jointly train a single mBART-based model for both text-to-sign (T2S) and sign-to-sign (S2S) translation. On synthetically generated paired sets between American Sign Language (ASL), Chinese Sign Language (CSL), and German Sign Language (DGS), our direct S2S method outperforms the cascaded baseline on geometric sign error metrics (20% lower DTW-aligned MPJPE) and on language matching metrics computed after predicted sign utterances are translated back to sentences (50% higher BLEU-4), while achieving a roughly 2.3× speedup. On a small set of pre-existing cross-lingual sign data, we find similar improvements for our proposed method.
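The geometric metric named above, DTW-aligned MPJPE, can be illustrated with a minimal sketch. MPJPE is the mean per-joint position error (average Euclidean distance between corresponding joints), and dynamic time warping aligns two sequences that may differ in length or signing speed before the error is averaged. The abstract does not give the joint set or the normalization, so dividing the accumulated error by the warping-path length is an assumption here, as are the function names.

```python
import math


def frame_mpjpe(frame_a, frame_b):
    """Mean Euclidean distance between corresponding joints of two frames."""
    return sum(math.dist(p, q) for p, q in zip(frame_a, frame_b)) / len(frame_a)


def dtw_mpjpe(pred, ref):
    """Average frame-level MPJPE along the optimal DTW alignment path.

    pred, ref: non-empty lists of frames; each frame is a list of joint
    coordinates (tuples). Normalizing by path length is an assumption;
    the paper's exact normalization is not stated in the abstract.
    """
    n, m = len(pred), len(ref)
    inf = float("inf")
    # cost[i][j]: minimal accumulated error aligning pred[:i] with ref[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    # steps[i][j]: number of aligned frame pairs on that minimal path
    steps = [[0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_mpjpe(pred[i - 1], ref[j - 1])
            # Standard DTW moves: match, or repeat a frame on either side.
            best, (pi, pj) = min(
                (cost[i - 1][j - 1], (i - 1, j - 1)),
                (cost[i - 1][j], (i - 1, j)),
                (cost[i][j - 1], (i, j - 1)),
            )
            cost[i][j] = best + d
            steps[i][j] = steps[pi][pj] + 1
    return cost[n][m] / steps[n][m]
```

Because the alignment absorbs differences in timing, a prediction that performs the same motion more slowly than the reference is not penalized, while a spatial offset in the joints still shows up in the score.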

Problem

Research questions and friction points this paper is trying to address.

sign language translation

direct sign-to-sign translation

cross-lingual communication

deaf and hard-of-hearing

visual modality

Innovation

Methods, ideas, or system contributions that make the work stand out.

sign-to-sign translation

back-translation

mBART-based model