Influence Guided Sampling for Domain Adaptation of Text Retrievers

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of sampling training data from heterogeneous corpora and tasks when training general-purpose dense text retrieval models. The authors propose Inf-DDS, a framework that, for the first time, integrates an influence-function-based reward mechanism into retrieval model training. Using reinforcement learning, Inf-DDS dynamically optimizes sampling weights across data sources, with influence signals computed on a target development set serving as rewards, enabling adaptive sampling without manual supervision. Evaluated on bge-m3 and all-MiniLM-L6-v2, the method achieves absolute NDCG@10 improvements of 5.03 and 0.94, respectively, while being 1.5x to 4x cheaper in GPU compute than existing gradient-based sampling methods.
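The core loop described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual implementation: the source names, reward values, and the exponentiated-gradient update rule are all assumptions standing in for whatever reward-to-weight rule Inf-DDS actually uses.

```python
import math
import random

def update_sampling_weights(weights, rewards, lr=0.5):
    """Multiplicatively reweight data sources by their influence rewards.

    `rewards[src]` stands in for an influence signal computed on the target
    dev set (higher = that source's examples help dev performance more).
    The exponentiated update and learning rate are illustrative assumptions.
    """
    new = {src: w * math.exp(lr * rewards[src]) for src, w in weights.items()}
    total = sum(new.values())
    return {src: w / total for src, w in new.items()}  # renormalize to a distribution

# Hypothetical data sources, starting from uniform sampling weights.
weights = {"msmarco": 1 / 3, "nq": 1 / 3, "hotpotqa": 1 / 3}
# Made-up per-source influence rewards on the target dev set.
rewards = {"msmarco": 0.8, "nq": 0.1, "hotpotqa": -0.2}

weights = update_sampling_weights(weights, rewards)
# Sources with higher influence on the dev set are now sampled more often.
batch_sources = random.choices(list(weights), weights=list(weights.values()), k=4)
```

In the paper's framework this update would be applied iteratively, recomputing influence rewards as the retriever trains, so the sampling policy adapts over the course of training.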

📝 Abstract
General-purpose open-domain dense retrieval systems are usually trained with a large, eclectic mix of corpora and search tasks. How should these diverse corpora and tasks be sampled for training? Conventional approaches sample them uniformly, proportional to their instance population sizes, or depend on human-level expert supervision. It is well known that the training data sampling strategy can greatly impact model performance. However, how to find the optimal strategy has not been adequately studied in the context of embedding models. We propose Inf-DDS, a novel reinforcement learning driven sampling framework that adaptively reweighs training datasets guided by influence-based reward signals and is much more lightweight with respect to GPU consumption. Our technique iteratively refines the sampling policy, prioritizing datasets that maximize model performance on a target development set. We evaluate the efficacy of our sampling strategy on a wide range of text retrieval tasks, demonstrating strong improvements in retrieval performance and better adaptation compared to existing gradient-based sampling methods, while also being 1.5x to 4x cheaper in GPU compute. Our sampling strategy achieves a 5.03 absolute NDCG@10 improvement while training a multilingual bge-m3 model and an absolute NDCG@10 improvement of 0.94 while training all-MiniLM-L6-v2, even when starting from expert-assigned weights on a large pool of training datasets.
Problem

Research questions and friction points this paper is trying to address.

domain adaptation
text retrieval
training data sampling
dense retrieval
influence-based sampling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Influence-based Sampling
Domain Adaptation
Dense Retrieval
Reinforcement Learning
Efficient Training
Meet Doshi (IBM Research AI)
Vishwajeet Kumar
Yulong Li
Jaydeep Sen (IBM Research AI)

Question Answering · Information Retrieval · NLP