🤖 AI Summary
This study addresses the challenge of efficiently adapting decoder-only large language models (LLMs) for zero-shot cross-lingual transfer. The authors systematically evaluate three parameter-efficient fine-tuning (PEFT) methods (prefix tuning, soft prompt tuning, and Llama Adapter) across 35+ high- and low-resource languages, using the Llama and Mistral model families. Results show that prefix-based methods outperform LoRA baselines in low-resource settings, with gains of up to 6% on the Belebele benchmark for Llama 3.1 8B while training only 1.23M parameters, and with improvements that hold across model scales and language families. The key contribution is a systematic empirical demonstration that prefix tuning is a lightweight, scalable alternative to LoRA for multilingual zero-shot transfer, offering a resource-efficient path to cross-lingual LLM adaptation in constrained environments.
📝 Abstract
With the release of new large language models (LLMs) such as Llama and Mistral, zero-shot cross-lingual transfer has become increasingly feasible thanks to their multilingual pretraining and strong generalization capabilities. However, adapting these decoder-only LLMs to new tasks across languages remains challenging. While parameter-efficient fine-tuning (PEFT) techniques like Low-Rank Adaptation (LoRA) are widely used, prefix-based techniques such as soft prompt tuning, prefix tuning, and Llama Adapter are less explored, especially for zero-shot transfer in decoder-only models. We present a comprehensive study of three prefix-based methods for zero-shot cross-lingual transfer from English to 35+ high- and low-resource languages. Our analysis further explores transfer across linguistic families and scripts, as well as the impact of scaling model size from 1B to 24B parameters. With Llama 3.1 8B, prefix methods outperform LoRA baselines by up to 6% on the Belebele benchmark, and similar improvements are observed with Mistral v0.3 7B. Despite training only 1.23M parameters with prefix tuning, we achieve consistent improvements across diverse benchmarks. These findings highlight prefix-based techniques as an effective and scalable alternative to LoRA, particularly in low-resource multilingual settings.
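The core idea behind the prefix tuning discussed above can be sketched in a few lines of plain Python: trainable key/value vectors are prepended to each attention layer's own keys and values, so attention can read from the learned prefix while the base model stays frozen. Everything below (sizes, names, the toy attention) is illustrative only and not taken from the paper; real models use thousands of hidden dimensions and dozens of layers.

```python
import math
import random

random.seed(0)

HIDDEN = 8        # toy hidden size (illustrative; real models use thousands)
PREFIX_LEN = 4    # number of virtual prefix tokens
NUM_LAYERS = 2    # toy depth

def new_vec():
    return [random.uniform(-0.1, 0.1) for _ in range(HIDDEN)]

# The only trainable parameters: one key and one value vector per prefix
# position, per layer. The base model's weights are never updated.
prefix_kv = [
    {"keys": [new_vec() for _ in range(PREFIX_LEN)],
     "values": [new_vec() for _ in range(PREFIX_LEN)]}
    for _ in range(NUM_LAYERS)
]

def trainable_params(prefix):
    # Count every scalar in the prefix key/value vectors.
    return sum(len(vec)
               for layer in prefix
               for seq in (layer["keys"], layer["values"])
               for vec in seq)

def attend(query, keys, values):
    # Frozen toy attention: softmax of dot products, weighted sum of values.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [sum(w / z * v[i] for w, v in zip(weights, values))
            for i in range(HIDDEN)]

def layer_forward(layer, query, token_keys, token_values):
    # Prefix tuning: prepend the trained prefix K/V to the layer's own K/V,
    # so the query attends over the prefix without touching base weights.
    keys = prefix_kv[layer]["keys"] + token_keys
    values = prefix_kv[layer]["values"] + token_values
    return attend(query, keys, values)

print(trainable_params(prefix_kv))  # 2 layers * 2 (K,V) * 4 tokens * 8 dims = 128
```

Scaling the same counting to a real model shows why the method is so cheap: the prefix parameters grow with layers, prefix length, and hidden size only, independent of the base model's billions of weights.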