🤖 AI Summary
To address the weak zero-shot cross-lingual transfer capability of large language models (LLMs) on non-Latin and low-resource languages—and their heavy reliance on fine-tuning—this paper proposes a query-level dynamic learning framework. During inference, the framework dynamically optimizes three components per input query: the prompt template, the selection of multilingual embedding models, and the LLM dispatch strategy—enabling fine-tuning-free, language-adaptive reasoning. Its core contribution is the first-ever query-driven, multi-component dynamic configuration mechanism, designed for both offline and online deployment. Evaluated on question answering across 18 diverse languages, the method achieves a 10–15% improvement over baseline pre-trained models and a fourfold gain over monolingual fine-tuned models. It significantly enhances zero-shot multilingual generalization while maintaining high inference efficiency.
📝 Abstract
Large language models (LLMs) have revolutionized various domains but still struggle with non-Latin scripts and low-resource languages. This paper addresses the critical challenge of improving multilingual performance without extensive fine-tuning. We introduce a novel dynamic learning approach that optimizes the prompt strategy, embedding model, and LLM per query at runtime. By adapting configurations dynamically, our method achieves significant improvements over static, best-fixed, and random baselines. It operates efficiently in both offline and online settings, generalizing seamlessly across new languages and datasets. Leveraging Retrieval-Augmented Generation (RAG) with state-of-the-art multilingual embeddings, we achieve superior task performance across diverse linguistic contexts. Through systematic investigation and evaluation across 18 diverse languages on popular question-answering (QA) datasets, we show our approach yields 10-15% improvements in multilingual performance over pre-trained models and 4x gains compared to fine-tuned, language-specific models.
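The core mechanism described above, selecting a (prompt strategy, embedding model, LLM) configuration per query at inference time, can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the configuration names and the scoring function are hypothetical stand-ins for whatever learned policy the framework uses to rank configurations from query features such as language.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical configuration space; names are illustrative, not from the paper.
PROMPTS = ["direct", "translate-then-answer", "chain-of-thought"]
EMBEDDERS = ["multilingual-e5", "labse"]
LLMS = ["llm-a", "llm-b"]

@dataclass(frozen=True)
class Config:
    prompt: str
    embedder: str
    llm: str

def score(config: Config, query_lang: str) -> float:
    """Toy scoring rule standing in for the learned per-query policy.

    A real system would derive these scores from query features
    (e.g. language, script, length) learned offline or online.
    """
    s = 0.0
    if query_lang == "sw" and config.prompt == "translate-then-answer":
        s += 1.0  # e.g. translation helps for this low-resource language
    if config.embedder == "multilingual-e5":
        s += 0.5  # e.g. stronger multilingual retrieval embeddings
    if config.llm == "llm-b":
        s += 0.2  # e.g. better zero-shot QA on non-Latin scripts
    return s

def select_config(query_lang: str) -> Config:
    # Enumerate the (prompt, embedder, LLM) space and pick the
    # best-scoring combination for this query.
    candidates = [Config(p, e, m)
                  for p, e, m in product(PROMPTS, EMBEDDERS, LLMS)]
    return max(candidates, key=lambda c: score(c, query_lang))

best = select_config("sw")  # a Swahili query, for illustration
print(best.prompt, best.embedder, best.llm)
```

In this toy setting the framework's key property is visible: no model is fine-tuned, yet each query is routed to a different configuration, which is what enables language-adaptive behavior in both offline and online deployment.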