🤖 AI Summary
Low-resource Indic languages (e.g., Bodo) suffer from poor reward-model generalization and a severe scarcity of high-quality preference data. Method: This paper proposes RELIC, a retrieval-augmented in-context learning framework for reward modeling. It trains a cross-lingual retriever with a pairwise ranking objective to select examples from high-resource auxiliary languages that most effectively highlight the quality difference between responses; the retrieved examples are then supplied in-context to an open-source reward model (e.g., LLaMA-3.2-3B), enabling knowledge transfer without collecting new preference data. Contribution/Results: RELIC achieves significant improvements over state-of-the-art example selection methods on three preference benchmarks (PKU-SafeRLHF, WebGPT, and HH-RLHF), demonstrating robust cross-lingual reward modeling. On Bodo, it improves reward accuracy by 12.81% over zero-shot prompting and by 10.13% over the prior best example selection method, validating its efficacy in low-resource settings.
📝 Abstract
Reward models are essential for aligning large language models (LLMs) with human preferences. However, most open-source multilingual reward models are primarily trained on preference datasets in high-resource languages, resulting in unreliable reward signals for low-resource Indic languages. Collecting large-scale, high-quality preference data for these languages is prohibitively expensive, making preference-based training approaches impractical. To address this challenge, we propose RELIC, a novel in-context learning framework for reward modeling in low-resource Indic languages. RELIC trains a retriever with a pairwise ranking objective to select in-context examples from auxiliary high-resource languages that most effectively highlight the distinction between preferred and less-preferred responses. Extensive experiments on three preference datasets (PKU-SafeRLHF, WebGPT, and HH-RLHF) using state-of-the-art open-source reward models demonstrate that RELIC significantly improves reward model accuracy for low-resource Indic languages, consistently outperforming existing example selection methods. For example, on Bodo, a low-resource Indic language, using a LLaMA-3.2-3B reward model, RELIC achieves a 12.81% and 10.13% improvement in accuracy over zero-shot prompting and the state-of-the-art example selection method, respectively.
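The abstract does not spell out the retriever's training objective, but a pairwise ranking objective over candidate in-context examples can be sketched minimally as follows. This is a toy illustration under stated assumptions, not the paper's implementation: embeddings are plain vectors, the score is cosine similarity, and all function and field names (`pairwise_ranking_loss`, `select_top_k`, `emb`) are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors (toy stand-in for a retriever score)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pairwise_ranking_loss(query_emb, helpful_emb, unhelpful_emb, margin=0.1):
    """Hinge-style pairwise ranking loss: push the retriever to score an example
    that clarifies the preferred/less-preferred distinction above one that does not."""
    s_pos = cosine(query_emb, helpful_emb)
    s_neg = cosine(query_emb, unhelpful_emb)
    return max(0.0, margin - (s_pos - s_neg))

def select_top_k(query_emb, candidates, k=2):
    """At inference, rank candidate high-resource-language examples by retriever
    score and keep the top k as in-context demonstrations."""
    ranked = sorted(candidates, key=lambda c: cosine(query_emb, c["emb"]), reverse=True)
    return [c["id"] for c in ranked[:k]]

# Toy usage: a well-separated pair incurs zero loss; a mis-ranked pair is penalized.
q = [1.0, 0.0]
print(pairwise_ranking_loss(q, [0.9, 0.1], [0.0, 1.0]))  # 0.0 (correctly ranked)
print(select_top_k(q, [
    {"id": "a", "emb": [0.0, 1.0]},
    {"id": "b", "emb": [1.0, 0.1]},
    {"id": "c", "emb": [0.8, 0.2]},
], k=2))
```

In the actual framework the score would come from a trained neural retriever over query-example pairs, and the selected examples would be prepended to the reward model's prompt; the hinge form above is only one common instantiation of a pairwise ranking objective.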