🤖 AI Summary
To address the low relevance of dense retrievers on complex queries involving negative constraints—caused by their neglect of semantic intent—this paper proposes Neural-Symbolic Integrated Retrieval (NS-IR). NS-IR is the first dense retrieval framework to incorporate First-Order Logic (FOL) modeling, enabling fine-grained re-ranking via logic consistency discrimination, logic alignment, and connective-aware constraint enforcement, while jointly optimizing logic-driven embedding representations. We introduce NegConstraint, the first benchmark dataset specifically designed for queries with negative constraints. Experiments demonstrate that NS-IR significantly outperforms state-of-the-art methods on negative-constraint queries; it also achieves superior performance in zero-shot web search and low-resource settings. The code and dataset are publicly released.
📝 Abstract
Information retrieval plays a crucial role in locating resources. Current dense retrievers find relevant documents in a corpus by computing similarities between dense embedding vectors. These similarities depend mainly on word co-occurrence between queries and documents and overlook the real intent of the query, so the retrievers often return numerous irrelevant documents. On complex queries such as negative-constraint queries, their retrieval performance can be catastrophic. To address this issue, we propose a neuro-symbolic information retrieval method, NS-IR, that leverages first-order logic (FOL) to optimize the embeddings of naive natural language by considering the logical consistency between queries and documents. Specifically, we introduce two novel techniques, logic alignment and connective constraint, to rerank candidate documents, thereby enhancing retrieval relevance. Furthermore, we construct a new dataset, NegConstraint, that includes negative-constraint queries, to evaluate NS-IR's performance in such complex IR scenarios. Our extensive experiments demonstrate that NS-IR not only achieves superior zero-shot retrieval performance on web search and low-resource retrieval tasks, but also performs better on negative-constraint queries. Our source code and dataset are available at https://github.com/xgl-git/NS-IR-main.
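The failure mode described in the abstract can be illustrated with a small toy sketch. This is not the NS-IR implementation. It makes several assumptions: a bag-of-words count vector stands in for a dense encoder, a hypothetical `negated_terms` helper treats the token after "not" or "without" as a negated term (instead of a real FOL translation), and a fixed score penalty stands in for a logic-based rerank. All function names here are illustrative.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a dense encoder (assumption): bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(query, docs):
    """Rank documents purely by embedding similarity to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)


def negated_terms(query):
    """Hypothetical parser: the token after 'not' or 'without' is negated."""
    tokens = query.lower().split()
    return {
        tokens[i + 1]
        for i, tok in enumerate(tokens[:-1])
        if tok in ("not", "without")
    }


def rerank(query, docs, penalty=1.0):
    """Similarity ranking with a penalty for documents that mention a negated term."""
    q = embed(query)
    banned = negated_terms(query)
    scored = []
    for doc in docs:
        score = cosine(q, embed(doc))
        if banned & set(doc.lower().split()):
            score -= penalty  # document violates the negative constraint
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


if __name__ == "__main__":
    query = "laptops without apple"
    docs = ["apple laptops review", "dell laptops review"]
    print(rank_by_similarity(query, docs))
    print(rerank(query, docs))
```

With the query "laptops without apple", the similarity-only ranking puts the Apple document first because it shares more words with the query, while the constraint-aware rerank demotes it. The paper's logic alignment and connective constraint techniques operate on FOL representations rather than on this token-level heuristic.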