Transformer-Encoder Trees for Efficient Multilingual Machine Translation and Speech Translation

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address computational redundancy and poor translation quality for low-resource languages in multilingual machine translation (MT) and speech translation (ST), this paper proposes a hierarchical Transformer encoder tree. Organized by linguistic similarity, the architecture shares intermediate representations across languages and generates all target-language translations in a single pass, facilitating cross-lingual knowledge transfer and parameter-efficient sharing. The authors integrate this encoder tree into a non-autoregressive ST framework that combines a CTC-trained encoder-only Transformer, a wav2vec 2.0 speech encoder, and hierarchical parameter sharing. Experiments demonstrate translation quality on par with autoregressive models on multilingual MT and ST benchmarks, while accelerating inference by 7–14× and substantially reducing computational cost.
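As a rough illustration of the tree idea (not the authors' released code), the sketch below builds a two-level encoder tree in PyTorch: a shared root encoder feeds per-group branch encoders for linguistically similar target languages, and each language gets its own CTC projection head. The layer counts, language grouping, and vocabulary sizes are all assumptions invented for the example.

```python
import torch
import torch.nn as nn

class EncoderTree(nn.Module):
    """Toy two-level Transformer encoder tree (hypothetical configuration)."""

    def __init__(self, groups, vocab_sizes, d_model=512, nhead=8):
        super().__init__()

        def encoder(num_layers):
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            return nn.TransformerEncoder(layer, num_layers=num_layers)

        self.root = encoder(4)  # shared trunk, run once for all targets
        self.branches = nn.ModuleDict({g: encoder(2) for g in groups})
        self.heads = nn.ModuleDict({
            lang: nn.Linear(d_model, vocab_sizes[lang] + 1)  # +1 for CTC blank
            for langs in groups.values() for lang in langs
        })
        self.groups = groups

    def forward(self, x):  # x: (batch, time, d_model)
        shared = self.root(x)  # intermediate representation, computed once
        logits = {}
        for group, langs in self.groups.items():
            h = self.branches[group](shared)  # group-specific refinement
            for lang in langs:
                logits[lang] = self.heads[lang](h)  # (batch, time, vocab+1)
        return logits

# All target languages emerge from a single forward pass:
groups = {"romance": ["fr", "es"], "germanic": ["de", "nl"]}
vocabs = {lang: 8000 for langs in groups.values() for lang in langs}
model = EncoderTree(groups, vocabs)
outputs = model(torch.randn(2, 50, 512))  # dict: language -> CTC logits
```

Because the root runs once and only the shallow branches and heads are language-specific, adding a target language costs far less than adding a full encoder-decoder model.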

📝 Abstract
Multilingual translation faces challenges of computational redundancy and limited accuracy for low-resource languages, especially in speech translation. To address this, we propose a novel hierarchical Transformer Encoder Tree (TET) combined with non-autoregressive encoder-only models trained with Connectionist Temporal Classification (CTC) for multilingual translation. By sharing intermediate representations among linguistically similar target languages, TET improves accuracy on low-resource languages, reduces computational redundancy, and generates all target languages in a single forward pass, eliminating sequential bottlenecks and improving parallelism. For speech translation, combining TET with a non-autoregressive speech recognition backbone (wav2vec 2.0) yields translation quality competitive with autoregressive systems while being 7–14 times faster.
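To make the training objective concrete, here is a minimal sketch of the per-language CTC loss, assuming each head emits logits of shape (batch, time, vocab+1) with index 0 reserved for the blank symbol; the shapes and blank index are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def ctc_loss_for_head(logits, targets, input_lengths, target_lengths):
    # PyTorch's CTC loss expects (time, batch, classes) log-probabilities.
    log_probs = F.log_softmax(logits, dim=-1).transpose(0, 1)
    return F.ctc_loss(log_probs, targets, input_lengths, target_lengths,
                      blank=0, zero_infinity=True)

# Hypothetical shapes matching one head of the encoder tree sketch:
logits = torch.randn(2, 50, 8001)           # batch=2, 50 encoder frames
targets = torch.randint(1, 8001, (2, 20))   # padded label ids (no blanks)
loss = ctc_loss_for_head(logits, targets,
                         input_lengths=torch.full((2,), 50),
                         target_lengths=torch.full((2,), 20))
```

A multilingual objective would then presumably sum (or weight) this term over all language heads.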
Problem

Research questions and friction points this paper is trying to address.

Addresses computational redundancy and accuracy limitations in multilingual machine translation
Improves translation for low-resource languages through shared linguistic representations
Eliminates sequential bottlenecks in speech translation while maintaining quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Transformer Encoder Tree for multilingual translation
Non-autoregressive encoder-only models with CTC training (see the decoding sketch after this list)
Shared intermediate representations among linguistically similar languages
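As a hedged sketch of how single-pass generation works at inference time (blank index and shapes are assumptions, as above): greedy CTC decoding takes the per-frame argmax, collapses repeated symbols, and drops blanks, so no sequential decoder loop is needed for any language.

```python
import torch

def ctc_greedy_decode(logits, blank=0):
    """Per-frame argmax, collapse repeats, drop blanks -> token id sequences."""
    ids = logits.argmax(dim=-1)  # (batch, time)
    decoded = []
    for seq in ids.tolist():
        tokens, prev = [], blank
        for t in seq:
            if t != blank and t != prev:
                tokens.append(t)
            prev = t
        decoded.append(tokens)
    return decoded

# Decode every language head from the same forward pass, e.g.:
# hypotheses = {lang: ctc_greedy_decode(lg) for lang, lg in outputs.items()}
print(ctc_greedy_decode(torch.randn(1, 10, 8001)))
```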
Yiwen Guan
Worcester Polytechnic Institute, Worcester, MA, USA
Jacob Whitehill
Worcester Polytechnic Institute
Artificial Intelligence