MultiConIR: Towards multi-condition Information Retrieval

📅 2025-03-11
🤖 AI Summary
To address the insufficient robustness of retrieval and reranking models under multi-condition queries, this paper introduces MultiConIRโ€”the first cross-domain benchmark for multi-condition information retrieval, covering books, films & TV, people, healthcare, and law. We formally define the multi-condition retrieval scenario and design three structured evaluation tasks: multi-condition robustness, monotonic relevance ranking, and query format sensitivity. Experiments reveal that mainstream rerankers degrade significantly as the number of conditions increases, whereas GritLM and Nv-Embed demonstrate superior adaptability. Further analysis shows that pooling strategies exhibit notable sensitivity to condition position. We publicly release the dataset, code, and a two-stage evaluation framework, establishing a reproducible benchmark, empirical insights, and foundational resources for condition-aware retrieval modeling.
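The "multi-condition robustness" task described above can be illustrated with a minimal sketch: append conditions to a query one at a time and check whether a retriever still ranks the fully matching document above a hard negative that satisfies all but one condition. Everything here is hypothetical — the toy bag-of-words cosine scorer merely stands in for a real dense retriever or reranker, and the example texts are invented.

```python
# Hypothetical sketch of multi-condition robustness evaluation.
# The bag-of-words "embedding" is a stand-in for a real encoder.
from collections import Counter
import math

def embed(text):
    # Toy representation: token counts instead of a dense vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented example: each added condition narrows the target document.
conditions = ["mystery novel", "set in victorian london", "female detective"]
gold = "a mystery novel set in victorian london with a female detective"
# Hard negative: matches every condition except the last one.
hard_negative = "a mystery novel set in victorian london with a male detective"

robust = []
for k in range(1, len(conditions) + 1):
    query = " ".join(conditions[:k])
    q = embed(query)
    # Robust at k conditions if the gold document is not outranked.
    robust.append(cosine(q, embed(gold)) >= cosine(q, embed(hard_negative)))

print(robust)  # with this toy scorer, gold ties or wins at every k
```

A real evaluation in this spirit would repeat the check over many query–document pairs and report how the win rate degrades as the condition count grows, which is where the paper observes rerankers falling behind.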

๐Ÿ“ Abstract
In this paper, we introduce MultiConIR, the first benchmark designed to evaluate retrieval models in multi-condition scenarios. Unlike existing datasets that primarily focus on single-condition queries from search engines, MultiConIR captures real-world complexity by incorporating five diverse domains: books, movies, people, medical cases, and legal documents. We propose three tasks to systematically assess retrieval and reranking models on multi-condition robustness, monotonic relevance ranking, and query format sensitivity. Our findings reveal that existing retrieval and reranking models struggle with multi-condition retrieval, with rerankers suffering severe performance degradation as query complexity increases. We further investigate the performance gap between retrieval and reranking models, explore potential reasons for these discrepancies, and analyze the impact of different pooling strategies on condition placement sensitivity. Finally, we highlight the strengths of GritLM and Nv-Embed, which demonstrate enhanced adaptability to multi-condition queries, offering insights for future retrieval models. The code and datasets are available at https://github.com/EIT-NLP/MultiConIR.
Problem

Research questions and friction points this paper is trying to address.

Evaluate retrieval models in multi-condition scenarios
Assess robustness, relevance ranking, and query sensitivity
Investigate performance gaps between retrieval and reranking models
Innovation

Methods, ideas, or system contributions that make the work stand out.

MultiConIR benchmark for multi-condition retrieval
Tasks assess robustness, ranking, and query sensitivity
GritLM and Nv-Embed adapt better to complex queries
Xuan Lu
Assistant Professor, University of Arizona
Human-centered Data Science · Human-AI Collaboration · Causal Inference · Future of Work · Emoji

Sifan Liu
Duke University

Bochao Yin
Ningbo Key Laboratory of Spatial Intelligence and Digital Derivative, Institute of Digital Twin, EIT

Yongqi Li
Department of Computing, The Hong Kong Polytechnic University

Xinghao Chen
Ningbo Key Laboratory of Spatial Intelligence and Digital Derivative, Institute of Digital Twin, EIT

Hui Su
Meituan Inc.

Yaohui Jin
Shanghai Jiao Tong University

Wenjun Zeng
Ningbo Key Laboratory of Spatial Intelligence and Digital Derivative, Institute of Digital Twin, EIT

Xiaoyu Shen
Eastern Institute of Technology, Ningbo
language model · multi-modal learning · reasoning