Updatable Balanced Index for Fast On-device Search with Auto-selection Model

📅 2025-11-25
🤖 AI Summary
To address the high construction overhead, inflexible updates, and inconsistent query performance of balanced multi-way KD-trees (BMKD-trees) on edge devices, this paper proposes UnIS, an efficiently updatable balanced index. The method (i) predicts splitting hyperplanes from the dataset distribution to accelerate tree construction; (ii) introduces a selective sub-tree rebuilding scheme that speeds up rebalancing during real-time insertion by reducing the number of data points involved; and (iii) uses an auto-selection model that picks a search strategy for each k-nearest neighbor (kNN) or radius query. Experiments show average speedups over the BMKD-tree of 17.96× in index construction, 1.60× in insertion, 7.15× in kNN search, and 1.09× in radius search; in dataset simplification on edge devices, UnIS achieves a 217× speedup over Lloyd's algorithm. The core contribution is a balanced indexing framework that combines distribution-aware construction, localized incremental updates, and automatic query strategy selection.

📝 Abstract
Diverse types of edge data, such as 2D geo-locations and 3D point clouds, are collected by sensors like lidar and GPS receivers on edge devices. On-device searches, such as k-nearest neighbor (kNN) search and radius search, are commonly used to enable fast analytics and learning technologies, such as k-means dataset simplification using kNN. To maintain high search efficiency, a representative approach is to utilize a balanced multi-way KD-tree (BMKD-tree). However, the index has shown limited gains, mainly due to substantial construction overhead, inflexibility to real-time insertion, and inconsistent query performance. In this paper, we propose UnIS to address the above limitations. We first accelerate the construction process of the BMKD-tree by utilizing the dataset distribution to predict the splitting hyperplanes. To make the continuously generated data searchable, we propose a selective sub-tree rebuilding scheme to accelerate rebalancing during insertion by reducing the number of data points involved. We then propose an auto-selection model to improve query performance by automatically selecting the optimal search strategy among multiple strategies for an arbitrary query task. Experimental results show that UnIS achieves average speedups of 17.96x in index construction, 1.60x in insertion, 7.15x in kNN search, and 1.09x in radius search compared to the BMKD-tree. We further verify its effectiveness in accelerating dataset simplification on edge devices, achieving a speedup of 217x over Lloyd's algorithm.
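As a point of reference for the index family the abstract discusses, the following is a minimal sketch of a balanced KD-tree with kNN search. It is a simplified binary tree built by median splits, not the multi-way BMKD-tree and not UnIS itself; the names `Node`, `build`, `sq_dist`, and `knn` are illustrative assumptions rather than anything defined by the paper.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # point stored at this node
    axis: int                       # dimension used for the split
    left: Optional["Node"] = None   # points with coordinate <= split value
    right: Optional["Node"] = None  # points with coordinate >= split value


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a balanced binary KD-tree by splitting at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(root: Optional[Node], query: Point, k: int) -> List[Point]:
    """Return the k points nearest to query, closest first."""
    heap: list = []  # max-heap emulated with negated squared distances

    def visit(node: Optional[Node]) -> None:
        if node is None:
            return
        d = sq_dist(node.point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, node.point))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, node.point))
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        visit(near)
        # Cross the splitting plane only if it could hide a closer point.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(root)
    return [p for _, p in sorted(heap, key=lambda t: -t[0])]
```

For example, with `root = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])`, the call `knn(root, (9, 2), 2)` returns `[(8, 1), (7, 2)]`. Sorting at every level keeps the sketch short but makes construction expensive, which is the kind of overhead the abstract says UnIS reduces by predicting splitting hyperplanes.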
Problem

Research questions and friction points this paper is trying to address.

Constructing a balanced index for on-device search incurs substantial overhead
Real-time insertion is inflexible because rebalancing is costly
Query performance is inconsistent across search tasks and strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Accelerates BMKD-tree construction using dataset distribution prediction
Uses selective sub-tree rebuilding for efficient real-time insertion
Implements auto-selection model for optimal query strategy choice
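One generic way to keep a KD-tree balanced under insertion is scapegoat-style partial rebuilding: when an insertion would make a subtree too lopsided, only that subtree is rebuilt instead of the whole index. The sketch below illustrates this idea on a binary KD-tree. It is an assumption about what a selective rebuilding scheme could look like, not the algorithm used by UnIS; `ALPHA`, `insert`, `collect`, and `height` are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]
ALPHA = 0.7  # hypothetical threshold: a child may hold at most 70% of its subtree


@dataclass
class Node:
    point: Point
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1  # number of points stored in this subtree


def size(node: Optional[Node]) -> int:
    return node.size if node else 0


def height(node: Optional[Node]) -> int:
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a balanced subtree by median splits, starting at the given depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    node = Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))
    node.size = len(pts)
    return node


def collect(node: Optional[Node], out: List[Point]) -> List[Point]:
    """Append every point of the subtree to out and return it."""
    if node:
        collect(node.left, out)
        out.append(node.point)
        collect(node.right, out)
    return out


def insert(node: Optional[Node], point: Point, depth: int = 0) -> Node:
    """Insert point and return the (possibly rebuilt) subtree root."""
    if node is None:
        return Node(point, depth % len(point))
    go_left = point[node.axis] < node.point[node.axis]
    child_size = size(node.left if go_left else node.right) + 1
    total = node.size + 1
    if child_size > ALPHA * total:
        # The insertion would unbalance this subtree: rebuild only this
        # subtree (the highest violating one on the path), not the whole tree.
        return build(collect(node, []) + [point], depth)
    if go_left:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    node.size = total
    return node
```

A tree grown with `root = insert(root, p)`, starting from `root = None`, stays shallow even when points arrive in sorted order, because each rebuild touches only the points under the violating node. A larger `ALPHA` triggers fewer rebuilds at the cost of a deeper tree.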
Yushuai Ji
Wuhan University
Vector Search, Clustering Algorithm, Vector Database
Sheng Wang
School of Computer Science, Wuhan University
Zhiyu Chen
Amazon
Conversational AI, Large Language Models, Information Retrieval, Natural Language Processing
Yuan Sun
La Trobe Business School, La Trobe University
Zhiyong Peng
School of Computer Science, Wuhan University