🤖 AI Summary
Contemporary large language models (LLMs) exhibit pronounced multilingual and cultural imbalances due to English-centric pretraining, especially for low-resource languages. To address this, we propose XTransplant, the first inference-time framework that activates latent multilingual and culturally grounded knowledge (already acquired but underutilized) through implicit cross-lingual alignment and transplantation of hidden states. Methodologically, we conduct a module-level functional disentanglement analysis, revealing that attention mechanisms primarily govern multilingual comprehension, while feed-forward networks encode culture-specific representations. Experiments demonstrate that XTransplant significantly improves performance on low-resource language tasks and enhances cultural adaptability, confirming mutual cross-lingual gains. Moreover, our findings reveal that the multilingual potential of current LLMs is substantially underestimated; we establish a more realistic upper bound on their practical capabilities and introduce a novel paradigm for efficient knowledge utilization.
📝 Abstract
Current large language models (LLMs) often exhibit imbalances in multilingual capabilities and cultural adaptability, largely attributed to their English-centric pre-training data. In this paper, we introduce and investigate a cross-lingual latent transplantation (XTransplant) framework, which aims to further exploit the model's internalized multilingual knowledge during inference and examine its effects on the multilingual capability and cultural adaptability of LLMs. The XTransplant framework enables models to harness the complementary strengths of both English and non-English resources by transplanting latent activations across languages. Through extensive analysis, we empirically demonstrate that XTransplant, a form of cross-lingual interaction, has mutually beneficial effects on the multilingual capability and cultural adaptability of LLMs, particularly for low-resource languages and cultures. We further reveal that attention modules play a pivotal role in supporting multilingual understanding, while feed-forward modules are more adept at capturing culture-specific knowledge. In addition, we conduct an in-depth analysis of XTransplant's stability, effectiveness, and generalizability. By probing the upper bound performance of XTransplant, we expose the considerable underutilization of current LLMs' multilingual potential, a challenge that remains open. We hope our analysis offers a new lens for advancing cross-lingual interactions and better leveraging models' internalized multilingual knowledge.
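To make the core idea of "transplanting latent activations across languages" concrete, here is a deliberately minimal toy sketch. It is not the paper's implementation: real XTransplant operates on attention and feed-forward activations inside an actual LLM, whereas this sketch stands in a tiny stack of linear-plus-tanh "layers" and shows only the mechanics of capturing a hidden state from a source-language (e.g. English) forward pass and injecting it into the target-language pass at a chosen depth. All names (`forward`, `transplant_at`, `donor_state`) are illustrative inventions.

```python
# Toy sketch of cross-lingual latent transplantation.
# Hypothetical simplification: each "layer" is a fixed linear map + tanh,
# standing in for a transformer block whose hidden states we can intercept.
import numpy as np

rng = np.random.default_rng(0)

# Three stand-in "transformer blocks" over a 4-dimensional hidden state.
LAYERS = [rng.standard_normal((4, 4)) * 0.1 + np.eye(4) for _ in range(3)]

def forward(x, transplant_at=None, donor_state=None):
    """Run the layer stack; optionally overwrite the hidden state after
    layer `transplant_at` with a donor state captured from another
    (e.g. English) forward pass."""
    states = []
    h = x
    for i, W in enumerate(LAYERS):
        h = np.tanh(W @ h)
        if transplant_at == i and donor_state is not None:
            h = donor_state  # transplant: replace the latent activation
        states.append(h.copy())
    return h, states

# 1) Source-language (English) pass: capture its latent states per layer.
en_input = np.ones(4)
en_out, en_states = forward(en_input)

# 2) Target-language pass, with and without transplanting the English
#    hidden state after layer 1; downstream layers then process the
#    English-derived representation.
xx_input = np.full(4, -0.5)
out_plain, _ = forward(xx_input)
out_transplanted, _ = forward(xx_input, transplant_at=1,
                              donor_state=en_states[1])
```

In the real method the transplanted tensors are module-level activations (attention vs. feed-forward outputs), which is what lets the paper disentangle which module family carries multilingual understanding versus culture-specific knowledge; this sketch only illustrates the interception-and-injection pattern at inference time.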