Bridging the Language Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs

📅 2023-05-28
📈 Citations: 7
Influential: 0
🤖 AI Summary
To address the weak zero-shot cross-lingual transfer capability of large language models (LLMs) on non-Latin and low-resource languages—and their heavy reliance on fine-tuning—this paper proposes a query-level dynamic learning framework. During inference, the framework dynamically optimizes three components per input query: the prompt template, the selection of multilingual embedding models, and the LLM dispatch strategy—enabling fine-tuning-free, language-adaptive reasoning. Its core contribution is the first-ever query-driven, multi-component dynamic configuration mechanism, designed for both offline and online deployment. Evaluated on question answering across 18 diverse languages, the method achieves a 10–15% improvement over baseline pre-trained models and a fourfold gain over monolingual fine-tuned models. It significantly enhances zero-shot multilingual generalization while maintaining high inference efficiency.
📝 Abstract
Large language models (LLMs) have revolutionized various domains but still struggle with non-Latin scripts and low-resource languages. This paper addresses the critical challenge of improving multilingual performance without extensive fine-tuning. We introduce a novel dynamic learning approach that optimizes the prompt strategy, embedding model, and LLM per query at runtime. By adapting configurations dynamically, our method achieves significant improvements over static, best, and random baselines. It operates efficiently in both offline and online settings, generalizing seamlessly across new languages and datasets. Leveraging Retrieval-Augmented Generation (RAG) with state-of-the-art multilingual embeddings, we achieve superior task performance across diverse linguistic contexts. Through systematic investigation and evaluation across 18 diverse languages using popular question-answering (QA) datasets, we show our approach yields 10-15% improvements in multilingual performance over pre-trained models and 4x gains compared to fine-tuned, language-specific models.
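The per-query dynamic selection described above can be pictured as a lightweight online policy that, for each incoming query, picks one (prompt strategy, embedding model, LLM) configuration and updates its preference from observed task quality. The sketch below is an illustrative assumption, not the paper's code: it uses a simple epsilon-greedy bandit keyed by query language, and all configuration names and reward values are hypothetical.

```python
import random
from collections import defaultdict

# Candidate (prompt, embedding, LLM) configurations -- names are illustrative.
CONFIGS = [
    {"prompt": "native", "embedding": "multilingual-emb-a", "llm": "llm-a"},
    {"prompt": "translate-to-en", "embedding": "multilingual-emb-b", "llm": "llm-b"},
    {"prompt": "few-shot", "embedding": "multilingual-emb-a", "llm": "llm-b"},
]

class DynamicSelector:
    """Epsilon-greedy per-language selection of a full pipeline configuration."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Running mean reward per (language, config index) pair.
        self.counts = defaultdict(int)
        self.means = defaultdict(float)

    def select(self, language):
        """Pick a configuration index for this query's language."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(CONFIGS))  # explore
        # Exploit: configuration with the highest observed mean reward.
        return max(range(len(CONFIGS)), key=lambda i: self.means[(language, i)])

    def update(self, language, idx, reward):
        """Online update after scoring the answer produced by config `idx`."""
        key = (language, idx)
        self.counts[key] += 1
        # Incremental running-mean update.
        self.means[key] += (reward - self.means[key]) / self.counts[key]

# Demo with pure exploitation (epsilon=0) so the choice is deterministic.
selector = DynamicSelector(epsilon=0.0)
selector.update("hi", 1, 0.8)  # config 1 scored well on a Hindi query
selector.update("hi", 0, 0.3)
print(selector.select("hi"))  # picks config 1 for subsequent Hindi queries
```

In an online deployment the reward would come from an automatic quality signal on the RAG answer; offline, the same statistics can be precomputed per language from a held-out set.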
Problem

Research questions and friction points this paper is trying to address.

Multilingual Language Models
Transfer Learning
Zero-Shot Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

RAG Technology
Multilingual Model Adaptation
Dynamic Adjustment Strategy
👥 Authors
A. Nambi, Microsoft Research India
Vaibhav Balloli, Microsoft Research India
M. Ranjit, Microsoft Research India
T. Ganu, Microsoft Research India
Kabir Ahuja, University of Washington (Natural Language Processing, Machine Learning)
Sunayana Sitaram, Microsoft Research India (Multilingual NLP, evaluation, LLMs and culture, multilingualism, LLMs)
Kalika Bali, Microsoft Research India