🤖 AI Summary
In in-context learning (ICL), similarity-based exemplar selection often induces topic bias, degrading downstream generalization. To address this, we propose a topic-diversity-aware exemplar re-ranking method that introduces, for the first time, Maximal Marginal Relevance (MMR) into ICL exemplar selection. Our approach explicitly balances relevance to the query against inter-exemplar topic diversity, thereby mitigating topic overload. It integrates dense and sparse embedding retrieval with MMR-based re-ranking and is evaluated under a multi-task zero-shot evaluation framework. Experiments demonstrate consistent and significant accuracy improvements across classification and reasoning tasks, robust to varying context lengths and similarity metrics. The method is computationally lightweight and requires no model fine-tuning or gradient updates. All code is publicly available.
📝 Abstract
In-Context Learning (ICL) has gained prominence due to its ability to perform tasks without requiring extensive training data and its robustness to noisy labels. A typical ICL workflow involves selecting localized examples relevant to a given input using sparse or dense embedding-based similarity functions. However, relying solely on similarity-based selection may introduce topical biases into the retrieved contexts, potentially leading to suboptimal downstream performance. We posit that re-ranking the retrieved context to enhance topical diversity can improve downstream task performance. To achieve this, we leverage Maximal Marginal Relevance (MMR), which balances topical similarity with inter-example diversity. Our experimental results demonstrate that diversifying the selected examples leads to consistent improvements in downstream performance across various context sizes and similarity functions. The implementation of our approach is made available at https://github.com/janak11111/Diverse-ICL.
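To illustrate the trade-off MMR makes between query relevance and inter-exemplar diversity, here is a minimal greedy re-ranking sketch. It is not the paper's implementation; the function name, the cosine similarity choice, and the λ values are illustrative assumptions:

```python
import numpy as np

def mmr_rerank(query_emb, cand_embs, k, lam=0.7):
    """Greedily select k candidate exemplars via Maximal Marginal Relevance.

    At each step, pick the candidate maximizing
        lam * sim(query, cand) - (1 - lam) * max_{s in selected} sim(cand, s),
    so higher lam favors relevance to the query and lower lam favors
    diversity from already-selected exemplars. Returns indices in
    selection order. (Illustrative sketch, not the paper's code.)
    """
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    selected, remaining = [], list(range(len(cand_embs)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -np.inf
        for i in remaining:
            rel = cos(query_emb, cand_embs[i])
            # Redundancy = similarity to the closest already-chosen exemplar.
            div = max((cos(cand_embs[i], cand_embs[j]) for j in selected),
                      default=0.0)
            score = lam * rel - (1 - lam) * div
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected
```

With `lam=1.0` this reduces to plain similarity ranking; lowering `lam` penalizes candidates that are near-duplicates of exemplars already chosen, which is the mechanism the abstract uses to counter topical bias.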