Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights

📅 2026-03-30
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limited instruction-following capability of large language models in low-resource languages and the high computational cost and data demands of conventional adaptation methods. It presents the first systematic investigation into the effectiveness of model weight merging for cross-lingual instruction transfer, directly fusing a general-purpose instruction-tuned model with a target-language base modelโ€”without requiring language-specific instruction data or repeated fine-tuning. Evaluated across four Iberian languages and two dominant architectures, the approach achieves instruction-following performance on par with traditional methods while substantially reducing computational overhead, thereby demonstrating the efficiency, generality, and scalability of weight merging as a viable strategy for low-resource multilingual adaptation.
๐Ÿ“ Abstract
Large Language Models (LLMs) remain heavily centered on English, with limited performance in low-resource languages. Existing adaptation approaches, such as continual pre-training, demand significant computational resources. In the case of instructed models, high-quality instruction data is also required; both resources are often inaccessible to low-resource language communities. Under these constraints, model merging offers a lightweight alternative, but its potential in low-resource contexts has not been systematically explored. In this work, we explore whether it is possible to transfer language knowledge to an instruction-tuned LLM by merging it with a language-specific base model, thereby eliminating the need for language-specific instructions and for repeated fine-tuning whenever stronger instructed variants become available. Through experiments covering four Iberian languages (Basque, Catalan, Galician, and Spanish) and two model families, we show that merging enables effective instruction-following behavior in new languages and even supports multilingual capability through the combination of multiple language-specific models. Our results indicate that model merging is a viable and efficient alternative to traditional adaptation methods for low-resource languages, achieving competitive performance while greatly reducing computational cost.
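The abstract describes fusing an instruction-tuned model with a target-language base model directly in weight space. The exact merging recipe is not given here; a common scheme for this kind of transfer is task-vector arithmetic, sketched below with toy scalar "weights" standing in for parameter tensors. All function and variable names are illustrative, not from the paper.

```python
# Hypothetical sketch of cross-lingual weight merging via task arithmetic.
# The instruction "task vector" (instruct - base) is added to a
# target-language base model, scaled by a merging coefficient alpha.

def merge_weights(base, instruct, lang_base, alpha=1.0):
    """Merge per-parameter: lang_base + alpha * (instruct - base).

    base      -- weights of the original general-purpose base model
    instruct  -- weights after instruction tuning on top of `base`
    lang_base -- weights after target-language pre-training on top of `base`
    """
    merged = {}
    for name in base:
        instruction_vector = instruct[name] - base[name]
        merged[name] = lang_base[name] + alpha * instruction_vector
    return merged

# Toy example: scalars in place of tensors, one "parameter" named "w".
base = {"w": 1.0}
instruct = {"w": 1.5}     # base + instruction tuning
lang_base = {"w": 2.0}    # base + target-language pre-training
print(merge_weights(base, instruct, lang_base))  # {'w': 2.5}
```

In a real setting the dictionaries would be model state dicts with matching keys, and `alpha` would be tuned to trade off instruction-following ability against target-language fluency.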
Problem

Research questions and friction points this paper is trying to address.

low-resource languages
instruction-tuned LLMs
model adaptation
computational efficiency
multilingual capability
Innovation

Methods, ideas, or system contributions that make the work stand out.

model merging
low-resource languages
instruction tuning
multilingual LLMs
parameter-efficient adaptation
Eneko Valero
HiTZ Center - Ixa, University of the Basque Country UPV/EHU
Maria Ribalta i Albado
HiTZ Center - Ixa, University of the Basque Country UPV/EHU
Oscar Sainz
University of the Basque Country (UPV/EHU)
Computer Science · Artificial Intelligence · Natural Language Processing · Information Extraction
Naiara Perez
NLP Researcher at IXA, HiTZ Center, University of the Basque Country
Natural Language Processing
German Rigau
HiTZ Center - Ixa, University of the Basque Country UPV/EHU