🤖 AI Summary
This work addresses the limited multilingual capabilities of large language models (LLMs), particularly their poor performance on low-resource and unseen languages. The authors propose XBridge, a novel architecture that extends LLMs' multilingual capacity without retraining them. XBridge employs a pretrained translation model to handle multilingual input and output, while preserving the LLM as an English-centric knowledge core. To mitigate representation misalignment between the translation model and the LLM, the framework introduces a lightweight cross-model mapping layer and a semantic alignment mechanism based on optimal transport. Experimental results show that XBridge significantly outperforms strong baselines across four prominent LLMs, with notable gains in understanding, reasoning, summarization, and generation tasks, especially for low-resource and previously unseen languages.
📝 Abstract
Large language models (LLMs) exhibit strong general intelligence, yet their multilingual performance remains highly imbalanced. Although LLMs encode substantial cross-lingual knowledge in a unified semantic space, they often fail to reliably access this knowledge through low-resource or unseen languages. Fortunately, pretrained encoder-decoder translation models already possess balanced multilingual capability, making them a natural complement to LLMs. In this work, we propose XBridge, a compositional encoder-LLM-decoder architecture that offloads multilingual understanding and generation to external pretrained translation models, while preserving the LLM as an English-centric core for general knowledge processing. To address the resulting representation misalignment across models, we introduce lightweight cross-model mapping layers and an optimal transport-based alignment objective, enabling fine-grained semantic consistency for multilingual generation. Experiments on four LLMs across multilingual understanding, reasoning, summarization, and generation show that XBridge outperforms strong baselines, especially on low-resource and previously unseen languages, without retraining the LLM.
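To make the two add-ons named in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of what a lightweight cross-model mapping layer and an entropic optimal-transport (Sinkhorn) alignment loss could look like. The `CrossModelMapping` MLP, the cosine cost, the Sinkhorn routine, and all dimensions are illustrative assumptions rather than the paper's actual implementation; the point of the sketch is that only the small mapping layers receive gradients while the translation model and the LLM stay frozen.

```python
# Illustrative sketch (assumptions, not the paper's released code): a small
# mapping MLP that projects translation-encoder states into the LLM embedding
# space, plus an entropic OT (Sinkhorn) alignment loss between the mapped
# states and reference LLM-side states. Dimensions and the cosine cost are
# made up for the example; only the mapper would be trained, the LLM is frozen.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModelMapping(nn.Module):
    """Lightweight bridge: translation-encoder space (d_enc) -> LLM space (d_llm)."""

    def __init__(self, d_enc: int, d_llm: int, d_hidden: int = 1024):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(d_enc, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_llm),
        )

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        # enc_states: (src_len, d_enc) -> (src_len, d_llm)
        return self.proj(enc_states)


def sinkhorn_plan(cost: torch.Tensor, eps: float = 0.1, iters: int = 50) -> torch.Tensor:
    """Entropic-regularized OT plan between uniform marginals via Sinkhorn iterations."""
    n, m = cost.shape
    a = torch.full((n,), 1.0 / n, device=cost.device)
    b = torch.full((m,), 1.0 / m, device=cost.device)
    K = torch.exp(-cost / eps)                      # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(iters):
        v = b / (K.t() @ u + 1e-9)
        u = a / (K @ v + 1e-9)
    return u.unsqueeze(1) * K * v.unsqueeze(0)      # transport plan, shape (n, m)


def ot_alignment_loss(mapped: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    """Transport cost between mapped multilingual states and reference (e.g. English-side) LLM states."""
    cost = 1.0 - F.normalize(mapped, dim=-1) @ F.normalize(reference, dim=-1).t()
    plan = sinkhorn_plan(cost)
    return (plan * cost).sum()


if __name__ == "__main__":
    mapper = CrossModelMapping(d_enc=1024, d_llm=4096)
    enc_states = torch.randn(12, 1024)              # multilingual encoder outputs (toy)
    llm_states = torch.randn(15, 4096)              # LLM states for the English reference (toy)
    loss = ot_alignment_loss(mapper(enc_states), llm_states)
    loss.backward()                                 # gradients flow only into the mapper here
    print(f"OT alignment loss: {loss.item():.4f}")
```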