Cross-Lingual Optimization for Language Transfer in Large Language Models

📅 2025-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Cross-lingual adaptation of large language models (LLMs) faces two challenges: degradation of English capability and scarcity of supervised data for low-resource languages. To address both, this paper proposes a cross-lingual optimization method that requires no target-language annotations. The approach uses only English supervised fine-tuning (SFT) data and a general-purpose machine translation model, employing a multi-stage parameter alignment strategy to improve target-language performance while preserving English capabilities. It outperforms standard SFT across six languages spanning diverse resource levels while relying far less on target-language data. Notably, in low-resource settings, using merely 3,200 English samples surpasses SFT trained on 6,400 target-language samples, indicating that the method achieves better performance with less data.

📝 Abstract

Adapting large language models to other languages typically relies on supervised fine-tuning (SFT). However, SFT often suffers from an overemphasis on English performance, a phenomenon that is especially pronounced in data-constrained environments. To overcome these challenges, we propose Cross-Lingual Optimization (CLO), which efficiently transfers an English-centric LLM to a target language while preserving its English capabilities. CLO utilizes publicly available English SFT data and a translation model to enable cross-lingual transfer. We conduct experiments using five models on six languages with varying resource levels. Our results show that CLO consistently outperforms SFT in both acquiring target-language proficiency and maintaining English performance. Remarkably, in low-resource languages, CLO with only 3,200 samples surpasses SFT with 6,400 samples, demonstrating that CLO can achieve better performance with less data. Furthermore, we find that SFT is particularly sensitive to data quantity in medium- and low-resource languages, whereas CLO remains robust. Our comprehensive analysis highlights the limitations of SFT and incorporates additional training strategies into CLO to enhance efficiency.

Problem

Research questions and friction points this paper is trying to address.

Overemphasis on English in multilingual model adaptation

Data inefficiency in low-resource language fine-tuning

Balancing target-language proficiency with retention of English capability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-Lingual Optimization (CLO) efficiently transfers English-centric LLMs to a target language

Uses only English SFT data and a translation model

Outperforms SFT with less data in low-resource languages
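
As a rough illustration of the data side of this idea, the sketch below pairs each English SFT example with a machine-translated counterpart. It is a minimal sketch under stated assumptions, not the paper's implementation: `SFTExample`, `build_cross_lingual_set`, and `toy_translate` are hypothetical names, the translator is a stand-in for a real translation model, and how the resulting pairs enter the training objective is not specified in this summary.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class SFTExample:
    """One instruction/response pair plus its language code (hypothetical schema)."""
    instruction: str
    response: str
    lang: str


def build_cross_lingual_set(
    english_data: Iterable[SFTExample],
    translate: Callable[[str, str], str],
    target_lang: str,
) -> List[SFTExample]:
    """Return each English example followed by its translated counterpart.

    `translate(text, target_lang)` is an assumed interface; any machine
    translation model could be wrapped to match it.
    """
    combined: List[SFTExample] = []
    for example in english_data:
        combined.append(example)  # keep the original English pair
        combined.append(
            SFTExample(
                instruction=translate(example.instruction, target_lang),
                response=translate(example.response, target_lang),
                lang=target_lang,
            )
        )
    return combined


def toy_translate(text: str, target_lang: str) -> str:
    """Placeholder translator that only tags the text; not a real model."""
    return f"[{target_lang}] {text}"


if __name__ == "__main__":
    english = [SFTExample("Name a primary color.", "Red.", "en")]
    for row in build_cross_lingual_set(english, toy_translate, "sw"):
        print(row.lang, "|", row.instruction, "|", row.response)
```

No target-language annotation is needed at any point here: the only inputs are English pairs and a translator, which matches the data requirements described above.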
