🤖 AI Summary
This work addresses the challenge of efficiently supporting multidimensional range and nearest-neighbor queries in memory by proposing a novel sliced k-d tree (skd-tree). The structure partitions subtree spaces into multiple slices along a single dimension and integrates compressed split points, B+-tree–style node organization, and a top-down balanced construction strategy to substantially reduce tree height and memory overhead. Furthermore, it leverages constant-time SIMD instructions to accelerate query processing and introduces an efficient update algorithm. Experimental results on both real-world and synthetic datasets demonstrate that the skd-tree significantly outperforms existing methods in terms of query performance and update efficiency.
📝 Abstract
In this paper, we revisit the problem of indexing multi-dimensional data in memory for the efficient support of multi-dimensional range queries and nearest neighbor queries. This is a classic problem in main-memory databases, where there is a need for indexing multiple columns simultaneously. Established data structures include the R-tree, kd-tree, quad-tree, and grid-based partitioning. More recently, multi-dimensional learned indexes have also been proposed to address this problem. We propose slicing kd-tree (skd-tree), a variant of the kd-tree, where each node partitions the space of its subtree into multiple slices across a single splitting dimension. By compressing the splitters of the partitions and with the help of data-parallelism, we (i) radically reduce the number of levels of the tree and (ii) limit the number of computations required for multi-dimensional range and proximity queries. The nodes of the skd-tree resemble the nodes of a main-memory B+-tree, however, a different dimension is used at each level. Our novel range and kNN algorithms on the skd-tree apply only a small constant number of SIMD instructions at each node during tree traversal. Our contributions also include a novel top-down construction algorithm, different types of inner and leaf nodes that warrant tree balancing, and a novel update algorithm. Our skd-tree achieves strong performance compared to existing methods, according to our experimental evaluation on real and synthetic datasets.