Building a Balanced k-d Tree in O(kn log n) Time

📅 2014-10-20
🏛️ arXiv.org
📈 Citations: 96
Influential: 5
🤖 AI Summary
Building a balanced k-d tree conventionally requires finding the median of the data at every recursive partition, and the choice of selection or sort used for that step strongly influences the overall build cost. This paper proposes an alternative based on multidimensional presorting: the data are sorted once along each of the k dimensions, and the order of those k sorts is preserved while the data are partitioned, so no further sorting is needed during tree construction. The algorithm runs in O(kn log n) time and is amenable to parallel execution via multiple threads. Compared with an algorithm that finds the median for each recursive subdivision, the presorting approach performs better for three or fewer dimensions and equivalently for four dimensions.
📝 Abstract
The original description of the k-d tree recognized that rebalancing techniques, such as are used to build an AVL tree or a red-black tree, are not applicable to a k-d tree. Hence, in order to build a balanced k-d tree, it is necessary to find the median of the data for each recursive partition. The choice of selection or sort that is used to find the median for each subdivision strongly influences the computational complexity of building a k-d tree. This paper discusses an alternative algorithm that builds a balanced k-d tree by presorting the data in each of k dimensions prior to building the tree. It then preserves the order of these k sorts during tree construction and thereby avoids the requirement for any further sorting. Moreover, this algorithm is amenable to parallel execution via multiple threads. Compared to an algorithm that finds the median for each recursive subdivision, this presorting algorithm has equivalent performance for four dimensions and better performance for three or fewer dimensions.
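The presort-then-partition idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `Node`, `build_presorted`, `_build`, and `_key` are hypothetical, duplicate points are simply dropped, and ties on one coordinate are broken by comparing the remaining coordinates in cyclic order, which is an assumption of this sketch. It is single-threaded and copies lists for readability rather than working in place.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


class Node:
    """One k-d tree node: a splitting point plus left and right subtrees."""

    def __init__(self, point: Point,
                 left: "Optional[Node]" = None,
                 right: "Optional[Node]" = None) -> None:
        self.point = point
        self.left = left
        self.right = right


def _key(p: Point, axis: int) -> Point:
    # Rotate the coordinates so `axis` is the most significant one.
    # The remaining coordinates only break ties (assumption of this sketch).
    return p[axis:] + p[:axis]


def build_presorted(points: Sequence[Sequence[float]]) -> Optional[Node]:
    """Build a balanced k-d tree by sorting once per dimension up front."""
    pts = list({tuple(p) for p in points})  # drop duplicates (simplification)
    if not pts:
        return None
    k = len(pts[0])
    # One sorted list per dimension: the only sorting done in the whole build.
    orders = [sorted(pts, key=lambda p, a=a: _key(p, a)) for a in range(k)]
    return _build(orders, 0)


def _build(orders: List[List[Point]], depth: int) -> Optional[Node]:
    axis = depth % len(orders)
    lead = orders[axis]          # list already sorted on the splitting axis
    if not lead:
        return None
    if len(lead) == 1:
        return Node(lead[0])
    median = lead[len(lead) // 2]
    mkey = _key(median, axis)
    # Split every presorted list around the median with one linear pass.
    # Filtering keeps the relative order, so each half stays sorted.
    left = [[p for p in o if _key(p, axis) < mkey] for o in orders]
    right = [[p for p in o if _key(p, axis) > mkey] for o in orders]
    return Node(median, _build(left, depth + 1), _build(right, depth + 1))


root = build_presorted([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
```

Each level of recursion makes one linear pass over each of the k lists, and a balanced tree has about log n levels, which is where the kn log n shape of the bound comes from. The two recursive calls at each node work on disjoint data, which is what makes the approach amenable to multiple threads.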
Problem

Research questions and friction points this paper is trying to address.

High-dimensional Space
Balanced k-d Tree
Efficiency Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

k-d Tree Construction
Parallel Processing
Sorting Optimization