CubeGraph: Efficient Retrieval-Augmented Generation for Spatial and Temporal Data

📅 2026-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current retrieval-augmented generation (RAG) systems handle hybrid queries, which combine high-dimensional vector search with spatio-temporal constraints, through decoupled architectures. The result is vector space fragmentation, broken graph connectivity, and high query overhead. This work proposes CubeGraph, a novel indexing framework that hierarchically partitions the spatial domain into grid cells and builds a modular vector graph within each cell. During query processing, CubeGraph dynamically stitches together the graphs of adjacent cells that intersect the spatial filter and performs a unified nearest-neighbor search in a single traversal. To our knowledge, this is the first approach to natively integrate vector retrieval with arbitrary spatial constraints: on-the-fly graph stitching restores global graph connectivity at runtime and eliminates the overhead of invoking multiple sub-indices. Experiments on real-world datasets show that CubeGraph significantly outperforms existing methods in query performance, scalability, and support for complex hybrid queries.
📝 Abstract
Hybrid queries combining high-dimensional vector similarity search with spatio-temporal filters are increasingly critical for modern retrieval-augmented generation (RAG) systems. Existing systems typically handle these workloads by nesting vector indices within low-dimensional spatial structures, such as R-trees. However, this decoupled architecture fragments the vector space, forcing the query engine to invoke multiple disjoint sub-indices per query. This fragmentation destroys graph routing connectivity, incurs severe traversal overhead, and struggles to optimize for complex spatial boundaries. In this paper, we propose CubeGraph, a novel indexing framework designed to natively integrate vector search with arbitrary spatial constraints. CubeGraph partitions the spatial domain using a hierarchical grid, maintaining modular vector graphs within each cell. During query execution, CubeGraph dynamically stitches together adjacent cube-level indices on the fly whenever their spatial cells intersect with the query filter. This dynamic graph integration restores global connectivity, enabling a unified, single-pass nearest-neighbor traversal that eliminates the overhead of fragmented sub-index invocations. Extensive evaluations on real-world datasets demonstrate that CubeGraph significantly outperforms state-of-the-art baselines, offering superior query execution performance, scalability, and flexibility for complex hybrid workloads.
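The abstract gives no implementation details, so the following is only a toy sketch of the general idea (per-cell graphs, stitching at query time, one traversal), not the authors' algorithm. Everything in it is an assumption made for illustration: a single-level 2-D grid instead of a hierarchical one, brute-force k-NN graphs per cell, a stitching rule that links each point to its nearest vector in a neighboring active cell, and the hypothetical names `GridGraphIndex`, `cell_size`, `m`, and `ef`.

```python
import heapq
import math
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class GridGraphIndex:
    """Toy single-level grid; each cell owns a small k-NN graph (assumed design)."""

    def __init__(self, points, cell_size, m=4):
        self.points = points              # list of ((x, y), vector)
        self.cell_size = cell_size
        self.cells = defaultdict(list)    # (cx, cy) -> point ids in that cell
        for pid, ((x, y), _) in enumerate(points):
            self.cells[self._cell_of(x, y)].append(pid)
        # Per-cell graph: link each point to its m nearest vectors in the same cell.
        self.edges = {pid: self._nearest(pid, ids, m)
                      for ids in self.cells.values() for pid in ids}

    def _cell_of(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def _nearest(self, pid, candidates, m):
        vec = self.points[pid][1]
        others = [c for c in candidates if c != pid]
        return heapq.nsmallest(m, others, key=lambda c: dist(vec, self.points[c][1]))

    def search(self, query_vec, rect, k=3, ef=16):
        xmin, ymin, xmax, ymax = rect
        lo, hi = self._cell_of(xmin, ymin), self._cell_of(xmax, ymax)
        # Non-empty cells whose extent intersects the rectangular filter.
        active = [c for c in self.cells
                  if lo[0] <= c[0] <= hi[0] and lo[1] <= c[1] <= hi[1]]
        if not active:
            return []
        active_set = set(active)

        # Stitch on the fly: temporary cross-cell edges between adjacent active
        # cells. Connectivity is NOT guaranteed if an empty cell separates them.
        stitched = defaultdict(list)
        for cx, cy in active:
            for nb in ((cx + 1, cy), (cx, cy + 1)):
                if nb in active_set:
                    for pid in self.cells[(cx, cy)]:
                        for other in self._nearest(pid, self.cells[nb], 1):
                            stitched[pid].append(other)
                            stitched[other].append(pid)

        def inside(pid):
            x, y = self.points[pid][0]
            return xmin <= x <= xmax and ymin <= y <= ymax

        # Single best-first traversal over the stitched graph from one entry point.
        entry = self.cells[active[0]][0]
        d0 = dist(query_vec, self.points[entry][1])
        visited = {entry}
        frontier = [(d0, entry)]                          # min-heap of candidates
        best = [(-d0, entry)] if inside(entry) else []    # max-heap of in-filter hits
        while frontier:
            d, pid = heapq.heappop(frontier)
            if len(best) >= ef and d > -best[0][0]:
                break
            for nb in self.edges[pid] + stitched[pid]:
                if nb in visited:
                    continue
                visited.add(nb)
                dn = dist(query_vec, self.points[nb][1])
                heapq.heappush(frontier, (dn, nb))
                if inside(nb):                            # route through any node,
                    heapq.heappush(best, (-dn, nb))       # but only report matches
                    if len(best) > ef:
                        heapq.heappop(best)
        return [pid for _, pid in sorted((-nd, pid) for nd, pid in best)[:k]]


# Made-up sample data: ((x, y) location, embedding vector).
sample = [((1, 1), (0.0, 0.0)), ((8, 2), (1.0, 0.0)), ((12, 3), (0.9, 0.1)),
          ((18, 8), (5.0, 5.0)), ((25, 5), (1.0, 0.05)), ((3, 15), (1.0, 0.1))]
index = GridGraphIndex(sample, cell_size=10)
print(index.search((1.0, 0.0), rect=(5, 0, 19, 9), k=2))  # → [1, 2]
```

In this sketch, points outside the filter (such as point 0) can still serve as routing nodes, which is one way a stitched graph can stay navigable when the filter cuts through a cell. Point 4 has a very similar vector but is never visited, because its cell does not intersect the filter.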
Problem

Research questions and friction points this paper is trying to address.

retrieval-augmented generation
vector similarity search
spatio-temporal queries
hybrid queries
index fragmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

retrieval-augmented generation
spatio-temporal indexing
vector similarity search
graph-based retrieval
hierarchical grid partitioning