๐ค AI Summary
To address the high computational cost and poor real-time interactivity of clustering-based labeling in large-scale point cloud embedding projections, this paper proposes a lightweight clustering method based on two-dimensional kernel density estimation (KDE). The core innovation lies in reformulating point-level clustering as density map analysis: adaptive KDE models spatial point distributions, while density peak detection coupled with geometry-aware thresholding enables rapid cluster identification. This work is the first to systematically replace point-wise operations with density field analysis in embedding visualization contexts. On standard benchmark datasets, our method achieves clustering in under 300 msโaccelerating existing approaches by two to three orders of magnitude. It supports millisecond-level responsive interactive labeling and semantic summarization, significantly enhancing visual interpretability and user navigation efficiency.
๐ Abstract
Interactive visualization of embedding projections is a useful technique for understanding data and evaluating machine learning models. Labeling data within these visualizations is critical for interpretation, as labels provide an overview of the projection and guide user navigation. However, most methods for producing labels require clustering the points, which can be computationally expensive as the number of points grows. In this paper, we describe an efficient clustering approach using kernel density estimation in the projected 2D space instead of points. This algorithm can produce high-quality cluster regions from a 2D density map in a few hundred milliseconds, orders of magnitude faster than current approaches. We contribute the design of the algorithm, benchmarks, and applications that demonstrate the utility of the algorithm, including labeling and summarization.