🤖 AI Summary
This work proposes a fully automatic, unsupervised, domain-adaptive query expansion framework that addresses the limitations of existing methods, which rely on manual prompting, handcrafted example selection, or a single language model and therefore scale poorly and transfer weakly across domains. The approach first constructs an in-domain example pool via BM25-MonoT5 pseudo-relevance feedback, then selects diverse examples with a training-free clustering strategy. It further introduces an ensemble mechanism in which two large language models (LLMs) independently generate expansions that a third LLM refines into a single coherent expansion, all without labeled data. Evaluated on TREC DL20, DBpedia, and SciFact, the method delivers statistically significant gains over BM25, Rocchio, zero-shot, and fixed few-shot baselines, with robust performance and strong cross-domain generalization.
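The diverse-example selection step can be sketched in a few lines. This is an illustrative stand-in, not the authors' exact algorithm: it assumes the pool is ranked by the retriever and uses token-set Jaccard distance with greedy farthest-point selection as a crude, training-free proxy for cluster-based diversity.

```python
def tokens(text):
    """Lowercased word set, a crude lexical representation of a passage."""
    return set(text.lower().split())

def jaccard_distance(a, b):
    """1 - |A∩B| / |A∪B|: 0 for identical token sets, 1 for disjoint ones."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    if not union:
        return 0.0
    return 1.0 - len(ta & tb) / len(union)

def select_diverse_exemplars(pool, k):
    """Greedy farthest-point selection over a pseudo-relevant passage pool.

    Start from the top-ranked passage, then repeatedly add the passage
    farthest (by Jaccard distance) from everything already chosen, so the
    k exemplars cover different regions of the pool.
    """
    if not pool:
        return []
    chosen = [pool[0]]
    while len(chosen) < min(k, len(pool)):
        best = max(
            (p for p in pool if p not in chosen),
            key=lambda p: min(jaccard_distance(p, c) for c in chosen),
        )
        chosen.append(best)
    return chosen
```

Any diversity-aware selector (e.g. k-means over dense embeddings, as a clustering strategy would use) slots into the same place; the point is only that selection needs no labels or training.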
📝 Abstract
Query expansion (QE) with large language models (LLMs) is promising but often relies on hand-crafted prompts, manually chosen exemplars, or a single LLM, making it hard to scale and sensitive to domain shift. We present an automated, domain-adaptive QE framework that builds in-domain exemplar pools by harvesting pseudo-relevant passages with a BM25-MonoT5 pipeline. A training-free, cluster-based strategy then selects diverse demonstrations, yielding strong and stable in-context QE without supervision. To exploit model complementarity, we introduce a two-LLM ensemble in which two heterogeneous LLMs independently generate expansions and a third, refinement LLM consolidates them into a single coherent expansion. Across TREC DL20, DBpedia, and SciFact, the refined ensemble delivers consistent, statistically significant gains over BM25, Rocchio, zero-shot, and fixed few-shot baselines. The framework offers a reproducible testbed for exemplar selection and multi-LLM generation, and a practical, label-free solution for real-world QE.
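The two-generator / one-refiner ensemble described above is mostly orchestration, which a minimal sketch can make concrete. The LLM calls are injected as plain callables (`gen_a`, `gen_b`, `refine`) so the flow is model-agnostic; the prompt templates and the query-plus-expansion concatenation are assumptions for illustration, not the paper's exact prompts.

```python
def ensemble_expand(query, exemplars, gen_a, gen_b, refine):
    """Two heterogeneous LLMs independently expand the query from the same
    few-shot exemplars; a third refinement LLM consolidates their drafts
    into one coherent expansion."""
    # Shared few-shot prompt built from the selected in-domain exemplars.
    prompt = "\n".join(f"Example passage: {e}" for e in exemplars)
    prompt += f"\nQuery: {query}\nWrite a passage that answers the query."
    draft_a = gen_a(prompt)
    draft_b = gen_b(prompt)
    # The refiner sees both drafts and merges them.
    merged = refine(
        f"Query: {query}\nDraft 1: {draft_a}\nDraft 2: {draft_b}\n"
        "Merge the drafts into one coherent expansion."
    )
    # Common practice in LLM-based QE: append the generated expansion to
    # the original query before handing it to the BM25 retriever.
    return f"{query} {merged}"
```

Because the three roles are just callables, the same harness runs with any two generation models and any refiner, which is what makes the ensemble label-free and easy to swap components in.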