In-memory Multidimensional Indexing Using the skd-tree

📅 2026-05-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses efficient in-memory support for multidimensional range and nearest-neighbor queries by proposing the slicing kd-tree (skd-tree), a kd-tree variant in which each node partitions the space of its subtree into multiple slices along a single splitting dimension. The structure combines compressed splitters, nodes organized like those of a main-memory B+-tree (with a different dimension used at each level), and a top-down construction algorithm with inner and leaf node types that keep the tree balanced, which together reduce the number of tree levels. Range and kNN queries apply only a small constant number of SIMD instructions at each node during traversal, and the paper also introduces an update algorithm. In the authors' experimental evaluation on real and synthetic datasets, the skd-tree achieves strong performance compared to existing methods.
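To make the effect of multi-way slicing on tree height concrete, here is a small back-of-the-envelope calculation. The dataset size, leaf capacity, and fanout of 16 are illustrative assumptions, not figures from the paper; the function only counts how many inner levels a perfectly balanced tree would need.

```python
def levels_needed(n_points: int, leaf_capacity: int, fanout: int) -> int:
    """Inner levels needed so that fanout**levels * leaf_capacity >= n_points.

    Assumes a perfectly balanced tree; all parameters are hypothetical.
    """
    levels = 0
    capacity = leaf_capacity
    while capacity < n_points:
        capacity *= fanout
        levels += 1
    return levels


n = 1_000_000  # assumed dataset size
binary_levels = levels_needed(n, leaf_capacity=32, fanout=2)   # classic kd-tree split
sliced_levels = levels_needed(n, leaf_capacity=32, fanout=16)  # assumed 16 slices per node
print(binary_levels, sliced_levels)
```

Under these assumptions, a binary split needs 15 inner levels while 16 slices per node need 4, which is the kind of height reduction that multi-way nodes aim for.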

📝 Abstract

In this paper, we revisit the problem of indexing multi-dimensional data in memory for the efficient support of multi-dimensional range queries and nearest neighbor queries. This is a classic problem in main-memory databases, where there is a need for indexing multiple columns simultaneously. Established data structures include the R-tree, kd-tree, quad-tree, and grid-based partitioning. More recently, multi-dimensional learned indexes have also been proposed to address this problem. We propose the slicing kd-tree (skd-tree), a variant of the kd-tree, where each node partitions the space of its subtree into multiple slices across a single splitting dimension. By compressing the splitters of the partitions and with the help of data-parallelism, we (i) radically reduce the number of levels of the tree and (ii) limit the number of computations required for multi-dimensional range and proximity queries. The nodes of the skd-tree resemble the nodes of a main-memory B+-tree; however, a different dimension is used at each level. Our novel range and kNN algorithms on the skd-tree apply only a small constant number of SIMD instructions at each node during tree traversal. Our contributions also include a novel top-down construction algorithm, different types of inner and leaf nodes that warrant tree balancing, and a novel update algorithm. Our skd-tree achieves strong performance compared to existing methods, according to our experimental evaluation on real and synthetic datasets.
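The core idea of the abstract (each node slices its subtree's space along one dimension, with a different dimension at each level) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Inner`, `Leaf`, `build`, and `range_query`, the constants `FANOUT` and `LEAF_CAPACITY`, and the round-robin choice of dimension are all assumptions made for illustration. Splitters are stored uncompressed, and binary search from Python's `bisect` module stands in for the SIMD comparisons the paper describes.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass

FANOUT = 4         # hypothetical number of slices per inner node
LEAF_CAPACITY = 4  # hypothetical maximum number of points per leaf


@dataclass
class Leaf:
    points: list


@dataclass
class Inner:
    dim: int         # splitting dimension used by this node
    splitters: list  # sorted; splitters[i] is the smallest key of child i + 1
    children: list   # len(children) == len(splitters) + 1


def build(points, dims, depth=0):
    """Top-down build: sort on one dimension, cut into equal-sized slices."""
    if len(points) <= LEAF_CAPACITY:
        return Leaf(list(points))
    dim = depth % dims  # assumed round-robin dimension choice
    pts = sorted(points, key=lambda p: p[dim])
    k = min(FANOUT, len(pts))
    bounds = [len(pts) * i // k for i in range(k + 1)]
    slices = [pts[bounds[i]:bounds[i + 1]] for i in range(k)]
    splitters = [s[0][dim] for s in slices[1:]]
    return Inner(dim, splitters, [build(s, dims, depth + 1) for s in slices])


def range_query(node, lo, hi):
    """Return all points p with lo[d] <= p[d] <= hi[d] in every dimension d."""
    if isinstance(node, Leaf):
        return [p for p in node.points
                if all(l <= x <= h for l, x, h in zip(lo, p, hi))]
    # Only slices whose key interval can overlap [lo, hi] on this dimension.
    first = bisect_left(node.splitters, lo[node.dim])
    last = bisect_right(node.splitters, hi[node.dim])
    result = []
    for child in node.children[first:last + 1]:
        result.extend(range_query(child, lo, hi))
    return result


points = [(x, y) for x in range(10) for y in range(10)]
tree = build(points, dims=2)
hits = sorted(range_query(tree, lo=(2, 3), hi=(4, 5)))
print(hits)
```

At each inner node the query inspects only that node's splitters to decide which slices to descend into, so the per-node work depends on the fanout rather than on the number of points below the node.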

Problem

Research questions and friction points this paper is trying to address.

in-memory indexing

multi-dimensional data

range queries

nearest neighbor queries

main-memory databases

Innovation

Methods, ideas, or system contributions that make the work stand out.

skd-tree

in-memory indexing

multi-dimensional queries