🤖 AI Summary
This work addresses semantic drift in semi-structured query expansion—e.g., “highly rated wildlife photography cameras compatible with Nikon F-mount”—caused by existing methods’ overreliance on lexical similarity while ignoring inter-document relational structure. We propose a relation-aware query expansion framework that jointly leverages knowledge graphs (KGs) and large language models (LLMs). Our approach models document texts as KG nodes and introduces a document-level relational filtering mechanism, shifting beyond entity-centric scoring to jointly encode textual semantics and structural constraints. The method integrates LLM-based prompt engineering, KG construction and pruning, and a multi-stage retrieval fusion architecture. Evaluated on three cross-domain benchmarks, it achieves an average 12.7% improvement in Recall@10 over state-of-the-art methods, demonstrating significant gains in capturing implicit relational intent within complex semi-structured queries.
📝 Abstract
Large language models (LLMs) have been used to generate query expansions that augment original queries and improve information search. Recent studies also explore providing LLMs with initial retrieval results to generate query expansions better grounded in the document corpus. However, these methods mostly focus on enhancing textual similarity between search queries and target documents, overlooking document relations. For queries like "Find me a highly rated camera for wildlife photography compatible with my Nikon F-Mount lenses", existing methods may generate expansions that are semantically similar but structurally unrelated to user intents. To handle such semi-structured queries with both textual and relational requirements, in this paper we propose a knowledge-aware query expansion framework that augments LLMs with structured document relations from a knowledge graph (KG). To further address the limitations of entity-based scoring in existing KG-based methods, we leverage document texts as rich KG node representations and use document-based relation filtering for our Knowledge-Aware Retrieval (KAR). Extensive experiments on three datasets from diverse domains show the advantages of our method over state-of-the-art baselines on textual and relational semi-structured retrieval.
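The core idea in the abstract, retrieval that satisfies both a textual requirement ("highly rated camera for wildlife photography") and a relational one ("compatible with my Nikon F-Mount lenses"), can be illustrated with a minimal sketch. This is not the paper's implementation: the toy corpus, the `compatible_with` relation, the lexical scoring function, and all identifiers below are illustrative assumptions.

```python
# Hedged sketch: rank documents by text similarity, but keep only those
# connected to an anchor node via a required relation in a document-level KG.
# All names and data here are illustrative, not from the paper.

def text_score(query_terms, doc_text):
    """Toy lexical similarity: fraction of query terms present in the doc."""
    terms = set(doc_text.lower().split())
    return sum(t in terms for t in query_terms) / len(query_terms)

def knowledge_aware_retrieve(query_terms, relation, anchor, docs, kg_edges, k=2):
    """Relation filtering first, textual ranking second: documents are KG
    nodes, and only those reachable from `anchor` via `relation` survive."""
    allowed = {obj for (subj, rel, obj) in kg_edges
               if subj == anchor and rel == relation}
    scored = [(doc_id, text_score(query_terms, text))
              for doc_id, text in docs.items() if doc_id in allowed]
    return [doc_id for doc_id, _ in sorted(scored, key=lambda x: -x[1])[:k]]

docs = {
    "cam_a": "highly rated camera for wildlife photography",
    "cam_b": "budget camera for studio portraits",
    "cam_c": "wildlife photography camera highly rated by reviewers",
}
# (subject_doc, relation, object_doc) triples; note cam_c has no edge.
kg_edges = [
    ("nikon_f_mount", "compatible_with", "cam_a"),
    ("nikon_f_mount", "compatible_with", "cam_b"),
]

results = knowledge_aware_retrieve(
    ["wildlife", "photography", "camera"],
    "compatible_with", "nikon_f_mount", docs, kg_edges,
)
# cam_c is textually the closest match but is filtered out, since it lacks
# the required compatibility relation; a purely lexical expander would rank it.
```

The point of the sketch is the failure mode the abstract describes: `cam_c` wins on text similarity alone, but only `cam_a` and `cam_b` satisfy the structural constraint, so relation filtering changes the result set.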