The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure

📅 2025-06-27
🤖 AI Summary
This work addresses the poor quality of large language model (LLM) generation in mid- to low-resource languages. It proposes and tests the “Translation Barrier Hypothesis”: LLMs implicitly follow a two-stage process, first solving the task in a largely target-language-agnostic way and then translating the answer concepts into the target language, and failure of the translation stage is an important cause of poor output quality. Using the logit lens interpretability method to observe intermediate layers, the study analyzes a word translation task across 108 language pairs and finds that a significant portion of errors arise from translation failure after the task has been solved correctly, especially for low-resource target languages. The findings identify an important hurdle for end-to-end multilingual generation and offer guidance for future work on improving LLMs’ multilingual capabilities.

📝 Abstract
Multilingual generation with large language models (LLMs) is often of poor quality for mid- to low-resource languages. Building on insights from interpretability, we demonstrate the existence of an implicit task-solving → translation pipeline for generation, whereby the model first solves the required task in a largely target-language-agnostic manner, and subsequently translates answer concepts into the intended target language. We hypothesize that the failure of the translation stage is an important culprit for the observed low quality of final outputs, and formalize this as the translation barrier hypothesis. We test this hypothesis for a word translation task across 108 language pairs, using logit lens to observe model processing in intermediate layers. We find that a significant portion of overall failures indeed stems from translation failure, or the model's inability to translate correctly solved intermediate concepts into the target language. This is especially true for low-resource target languages. Our results highlight an important hurdle for end-to-end multilingual generation, and lend guiding insights for future work seeking to improve multilinguality in LLMs.
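
The distinction the abstract draws between task-solving failure and translation failure can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `classify_outcome`, the exact-match criteria, and the toy examples are hypothetical and are not the paper's actual evaluation procedure.

```python
from collections import Counter

def classify_outcome(intermediate_tokens, final_output, concept_words, target_word):
    """Label one word-translation example (hypothetical criteria).

    intermediate_tokens: top tokens decoded from intermediate layers, any language
    final_output: the model's final answer
    concept_words: correct renderings of the answer concept in any language
    target_word: the correct answer in the requested target language
    """
    if final_output == target_word:
        return "success"
    # Treat the task as "solved" if any intermediate layer surfaced a correct
    # rendering of the concept, regardless of which language it was in.
    solved = any(tok in concept_words for tok in intermediate_tokens)
    return "translation_failure" if solved else "task_solving_failure"

# Toy data: the answer concept is "cat", the target language is Spanish.
concept = {"cat", "chat", "gato"}
examples = [
    (["cat", "chat", "gato"], "gato", concept, "gato"),   # correct output
    (["cat", "cat", "perro"], "perro", concept, "gato"),  # concept found, wrong output
    (["dog", "dog", "perro"], "perro", concept, "gato"),  # concept never found
]
counts = Counter(classify_outcome(*ex) for ex in examples)
```

Under these assumptions, the share of `translation_failure` among all failures is the quantity the hypothesis is concerned with.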
Problem

Research questions and friction points this paper is trying to address.

Multilingual generation quality is poor for mid- to low-resource languages
Implicit translation failure is hypothesized to be an important cause of low-quality outputs in LLMs
How far the translation barrier hypothesis explains multilingual generation failures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit task-solving to translation pipeline
Logit lens for intermediate layer observation
Translation barrier hypothesis formalization
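
As a rough illustration of the logit lens idea listed above, the sketch below projects each layer's hidden state through the output (unembedding) matrix and reads off the top token. It is a toy under stated assumptions: the vocabulary, matrix, and hidden states are made-up values, the helper names are hypothetical, and the final normalization that real implementations typically apply before unembedding is omitted.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def logit_lens(hidden_states, unembedding, vocab):
    """Decode each layer's hidden state directly through the output projection."""
    decoded = []
    for h in hidden_states:
        logits = matvec(unembedding, h)
        best = max(range(len(logits)), key=logits.__getitem__)
        decoded.append(vocab[best])
    return decoded

# Toy setup (all values hypothetical): 3-word vocabulary, 2-dimensional states.
vocab = ["cat", "gato", "perro"]
unembedding = [
    [1.0, 0.0],    # direction for "cat"
    [0.0, 1.0],    # direction for "gato"
    [-1.0, -1.0],  # direction for "perro"
]
hidden_states = [
    [0.9, 0.1],  # early layer
    [0.6, 0.5],  # middle layer
    [0.2, 0.8],  # final layer
]
per_layer = logit_lens(hidden_states, unembedding, vocab)
```

In this toy run the earlier layers decode to the English word and only the last layer decodes to the Spanish one, which is the kind of layer-by-layer trace the method is used to inspect.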