Exploring the Role of Diversity in Example Selection for In-Context Learning

📅 2025-05-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In in-context learning (ICL), similarity-based exemplar selection often induces topic bias, degrading downstream generalization. To address this, we propose a topic-diversity-aware exemplar re-ranking method that, for the first time, introduces Maximal Marginal Relevance (MMR) into ICL exemplar selection. Our approach explicitly balances relevance to the query against inter-exemplar topic diversity, thereby mitigating topic overload. It integrates dense and sparse embedding retrieval with MMR-based re-ranking and is evaluated under a multi-task zero-shot evaluation framework. Experiments demonstrate consistent and significant improvements in accuracy across classification and reasoning tasks, robust across varying context lengths and similarity metrics. The method is computationally lightweight and does not require model fine-tuning or gradient updates. All code is publicly available.
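The relevance-diversity balance described above follows the standard MMR criterion (Carbonell and Goldstein, 1998). In exemplar-selection terms, with query $q$, retrieved candidate pool $R$, and already-selected exemplars $S$ (the symbols below are the standard formulation, not notation from this paper):

```latex
\mathrm{MMR} = \arg\max_{e_i \in R \setminus S}
  \Big[ \lambda \, \mathrm{sim}(e_i, q)
        - (1 - \lambda) \max_{e_j \in S} \mathrm{sim}(e_i, e_j) \Big]
```

Here $\lambda \in [0, 1]$ trades off relevance to the query against redundancy with exemplars already chosen; $\mathrm{sim}$ is the dense or sparse similarity used at retrieval time.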

📝 Abstract
In-Context Learning (ICL) has gained prominence due to its ability to perform tasks without requiring extensive training data and its robustness to noisy labels. A typical ICL workflow involves selecting localized examples relevant to a given input using sparse or dense embedding-based similarity functions. However, relying solely on similarity-based selection may introduce topical biases in the retrieved contexts, potentially leading to suboptimal downstream performance. We posit that reranking the retrieved context to enhance topical diversity can improve downstream task performance. To achieve this, we leverage maximum marginal relevance (MMR) which balances topical similarity with inter-example diversity. Our experimental results demonstrate that diversifying the selected examples leads to consistent improvements in downstream performance across various context sizes and similarity functions. The implementation of our approach is made available at https://github.com/janak11111/Diverse-ICL.
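The greedy MMR re-ranking the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's released implementation: the function name `mmr_rerank`, cosine similarity, and the `lam` default are assumptions.

```python
import numpy as np

def mmr_rerank(query_vec, cand_vecs, k, lam=0.7):
    """Greedily re-rank retrieved exemplars with Maximal Marginal Relevance.

    query_vec: (d,) embedding of the test input
    cand_vecs: (n, d) embeddings of the retrieved candidate exemplars
    k: number of exemplars to keep for the prompt
    lam: trade-off between query relevance and inter-exemplar diversity
    Returns the indices of the selected candidates, in selection order.
    """
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    # Relevance of each candidate to the query (computed once).
    rel = np.array([cos(query_vec, c) for c in cand_vecs])
    selected, remaining = [], list(range(len(cand_vecs)))
    while remaining and len(selected) < k:
        best, best_score = None, -np.inf
        for i in remaining:
            # Redundancy penalty: max similarity to any already-selected
            # exemplar (0.0 when nothing has been selected yet).
            div = max((cos(cand_vecs[i], cand_vecs[j]) for j in selected),
                      default=0.0)
            score = lam * rel[i] - (1 - lam) * div
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam = 1.0` this reduces to pure similarity ranking; lowering `lam` increasingly penalizes candidates that are near-duplicates of exemplars already in the prompt, which is the topic-bias mitigation the paper targets.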
Problem

Research questions and friction points this paper is trying to address.

Addressing topical biases in example selection for ICL
Improving downstream performance via diverse example reranking
Balancing similarity and diversity using MMR in ICL
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses maximum marginal relevance for diversity
Balances similarity and inter-example diversity
Improves downstream task performance consistently
Janak Kapuriya
Data Science Institute, University of Galway, Galway, Ireland
Manit Kaushik
Computer Science and Engineering, IIIT Delhi, New Delhi, India
Debasis Ganguly
Asst. Professor, University of Glasgow
Information Retrieval · Explainability · RAG · Fairness
Sumit Bhatia
Senior ML Scientist, Adobe Inc.
Information Retrieval · Large Language Models · Knowledge Graphs