🤖 AI Summary
Traditional vector retrieval relies on approximate nearest neighbor (ANN) search, which often yields semantically redundant results and fails to meet the diversity and contextual richness requirements of applications such as retrieval-augmented generation (RAG) and multi-hop question answering. To address this, we propose a retrieval paradigm, “Semantic Compression and Graph-Enhanced Retrieval”, that brings submodular optimization into vector retrieval. We formalize semantic compression as selecting a compact set of vectors that maximizes information coverage while explicitly suppressing redundancy. Using information-geometric similarity measures and k-nearest neighbor (kNN) graphs, we build a multi-hop semantic search framework, which can also incorporate knowledge-graph links for structured semantic querying. The method supports hybrid indexing and targets greater semantic diversity and coverage in high-dimensional embedding spaces than proximity-only ANN retrieval. The implementation is publicly available to support further research on semantics-centric vector search.
📝 Abstract
Vector databases typically rely on approximate nearest neighbor (ANN) search to retrieve the top-k closest vectors to a query in embedding space. While effective, this approach often yields semantically redundant results, missing the diversity and contextual richness required by applications such as retrieval-augmented generation (RAG), multi-hop QA, and memory-augmented agents. We introduce a new retrieval paradigm: semantic compression, which aims to select a compact, representative set of vectors that captures the broader semantic structure around a query. We formalize this objective using principles from submodular optimization and information geometry, and show that it generalizes traditional top-k retrieval by prioritizing coverage and diversity. To operationalize this idea, we propose graph-augmented vector retrieval, which overlays semantic graphs (e.g., kNN or knowledge-based links) atop vector spaces to enable multi-hop, context-aware search. We theoretically analyze the limitations of proximity-based retrieval under high-dimensional concentration and highlight how graph structures can improve semantic coverage. Our work outlines a foundation for meaning-centric vector search systems, emphasizing hybrid indexing, diversity-aware querying, and structured semantic retrieval. We make our implementation publicly available to foster future research in this area.
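To make the contrast with top-k retrieval concrete, the following is a minimal sketch of the two ideas named in the abstract, not the authors' implementation. It assumes cosine similarity, a symmetrized kNN graph, and a facility-location coverage objective with an added relevance term; the function names (`top_k`, `knn_graph`, `expand`, `compress`) and the `relevance_weight` parameter are hypothetical choices made for illustration. With non-negative similarities this objective is monotone submodular, so greedy selection carries the standard (1 - 1/e) approximation guarantee.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def top_k(query, vectors, k):
    """Proximity-only baseline: indices of the k vectors closest to the query."""
    order = sorted(range(len(vectors)), key=lambda i: -cosine(query, vectors[i]))
    return order[:k]

def knn_graph(vectors, n_neighbors):
    """Symmetrized kNN graph (assumption: edges are made undirected)."""
    graph = {i: set() for i in range(len(vectors))}
    for i, v in enumerate(vectors):
        others = sorted((j for j in range(len(vectors)) if j != i),
                        key=lambda j: -cosine(v, vectors[j]))
        for j in others[:n_neighbors]:
            graph[i].add(j)
            graph[j].add(i)
    return graph

def expand(seeds, graph, hops):
    """Multi-hop expansion: every node within `hops` edges of a seed."""
    seen, frontier = set(seeds), set(seeds)
    for _ in range(hops):
        frontier = {j for i in frontier for j in graph[i]} - seen
        seen |= frontier
    return seen

def compress(query, vectors, candidates, k, relevance_weight=0.5):
    """Greedily pick k candidates maximizing coverage of the candidate pool
    plus a weighted relevance-to-query term (illustrative objective)."""
    cands = sorted(candidates)
    sim = {(a, b): max(0.0, cosine(vectors[a], vectors[b]))
           for a in cands for b in cands}
    rel = {s: max(0.0, cosine(query, vectors[s])) for s in cands}
    covered = {c: 0.0 for c in cands}  # best similarity to the selected set
    selected = []
    for _ in range(min(k, len(cands))):
        best_item, best_gain = None, -math.inf
        for s in cands:
            if s in selected:
                continue
            # Marginal gain: extra coverage s adds, plus its relevance.
            gain = sum(max(0.0, sim[(c, s)] - covered[c]) for c in cands)
            gain += relevance_weight * rel[s]
            if gain > best_gain:
                best_item, best_gain = s, gain
        selected.append(best_item)
        for c in cands:
            covered[c] = max(covered[c], sim[(c, best_item)])
    return selected

# Toy data: three near-duplicates along one direction, two along another.
vectors = [[1.0, 0.0], [0.99, 0.05], [0.98, 0.1], [0.3, 0.95], [0.25, 0.97]]
query = [1.0, 0.1]

baseline = top_k(query, vectors, k=2)
pool = expand(baseline, knn_graph(vectors, n_neighbors=2), hops=1)
compressed = compress(query, vectors, pool, k=2)
print("top-k:", baseline)
print("compressed:", compressed)
```

On this toy data the baseline returns two near-duplicate vectors, while the greedy selection keeps the vector closest to the query and adds one from the second direction, which the graph expansion made reachable. A larger `relevance_weight` pulls the selection back toward plain top-k behavior.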