🤖 AI Summary
To address the dual challenges of **restricted data sharing** (physical boundary) and **significant linguistic divergence** (linguistic boundary) in fine-tuning large language models for low-resource languages, this paper proposes **Multilingual Federated Prompt Tuning (MFPT)**—the first framework integrating federated learning with parameter-efficient prompt tuning in multilingual settings. MFPT introduces a language-distance-aware dynamic weight aggregation mechanism to jointly enable cross-lingual knowledge transfer and strict local data privacy preservation. It supports fully decentralized training with data remaining within its origin domain, and leverages language distance modeling to foster reciprocal enhancement among low-resource languages. Experiments demonstrate that MFPT achieves an average accuracy improvement of 6.9% on low-resource language tasks, while exhibiting superior data efficiency, training stability, and cross-lingual generalization—effectively overcoming both physical and linguistic boundaries.
📝 Abstract
Pre-trained large language models (LLMs) have become a cornerstone of modern natural language processing, with their capabilities extending across a wide range of applications and languages. However, the fine-tuning of multilingual LLMs, especially for low-resource languages, faces significant challenges arising from data-sharing restrictions (the physical border) and inherent linguistic differences (the linguistic border). These barriers hinder users of various languages, particularly those in low-resource regions, from fully benefiting from the advantages of LLMs. To address these challenges, we propose the Federated Prompt Tuning Paradigm for multilingual scenarios, which utilizes parameter-efficient fine-tuning while adhering to data sharing restrictions. We design a comprehensive set of experiments and analyze them using a novel notion of language distance to highlight the strengths of our paradigm: Even under computational constraints, our method not only improves data efficiency but also facilitates mutual enhancements across languages, particularly benefiting low-resource ones. Compared to traditional local cross-lingual transfer tuning methods, our approach achieves 6.9% higher accuracy with improved data efficiency, and demonstrates greater stability and generalization. These findings underscore the potential of our approach to promote social equality and champion linguistic diversity, ensuring that no language is left behind.