PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from Related Example Banks

📅 2024-12-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

167K/year

🤖 AI Summary

In low-resource Indian languages, few-shot in-context learning (ICL) suffers from severe performance degradation due to the scarcity of high-quality, task-relevant demonstrations. Method: We propose a cross-lingual example transfer framework that leverages demonstration corpora from high-resource related languages. It employs a multi-task-aligned cross-lingual retriever and incorporates a semantic diversity constraint to mitigate retrieval bias. We further introduce the first alternating minimization algorithm for demonstration selection, jointly optimizing retrieval quality and task-specific adaptability. Contribution/Results: The method is parameter-efficient—requiring no fine-tuning—and is compatible with mainstream open-weight LLMs (e.g., LLaMA-3.1-8B, Qwen-2.5-7B). Evaluated across four generative tasks, it achieves an average improvement of 12.7% in BLEU and ROUGE scores over state-of-the-art cross-lingual ICL approaches.

Technology Category

Application Category

📝 Abstract

Large Language Models (LLMs) have recently demonstrated impressive few-shot learning capabilities through in-context learning (ICL). However, ICL performance is highly dependent on the choice of few-shot demonstrations, making the selection of the most optimal examples a persistent research challenge. This issue is further amplified in low-resource Indic languages, where the scarcity of ground-truth data complicates the selection process. In this work, we propose PromptRefine, a novel Alternating Minimization approach for example selection that improves ICL performance on low-resource Indic languages. PromptRefine leverages auxiliary example banks from related high-resource Indic languages and employs multi-task learning techniques to align language-specific retrievers, enabling effective cross-language retrieval. Additionally, we incorporate diversity in the selected examples to enhance generalization and reduce bias. Through comprehensive evaluations on four text generation tasks -- Cross-Lingual Question Answering, Multilingual Question Answering, Machine Translation, and Cross-Lingual Summarization using state-of-the-art LLMs such as LLAMA-3.1-8B, LLAMA-2-7B, Qwen-2-7B, and Qwen-2.5-7B, we demonstrate that PromptRefine significantly outperforms existing frameworks for retrieving examples.

Problem

Research questions and friction points this paper is trying to address.

Improving few-shot learning for low-resource Indic languages

Selecting optimal examples from related high-resource languages

Enhancing cross-language retrieval with diversity and multi-task learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Alternating Minimization for example selection

Multi-task learning for cross-language retrieval

Diverse example selection to reduce bias

🔎 Similar Papers

No similar papers found.