🤖 AI Summary
To address the difficulty of modeling hierarchical relationships in tree-structured data within Euclidean space, this paper proposes HypeGBMS, a hyperbolic extension of Gaussian Blurring Mean Shift (GBMS) and the first density-based clustering framework operating in hyperbolic space. Methodologically, the authors generalize GBMS to hyperbolic geometry by replacing Euclidean distances with hyperbolic distances and Euclidean averages with Möbius-weighted means, so that every update stays consistent with the geometry of the space. They analyze the method's convergence and computational complexity and devise an adaptive bandwidth estimation strategy. The key contribution is the first unification of hierarchy-aware and density-driven clustering in hyperbolic space. Experiments on 11 real-world datasets, particularly those with tree-like structures, show that HypeGBMS significantly outperforms classical Euclidean mean shift in both clustering quality and robustness, validating its effectiveness in non-Euclidean settings.
📝 Abstract
Clustering is a fundamental unsupervised learning task for uncovering patterns in data. While Gaussian Blurring Mean Shift (GBMS) has proven effective for identifying arbitrarily shaped clusters in Euclidean space, it struggles with datasets exhibiting hierarchical or tree-like structures. In this work, we introduce HypeGBMS, a novel extension of GBMS to hyperbolic space. Our method replaces Euclidean computations with hyperbolic distances and employs Möbius-weighted means to ensure that all updates remain consistent with the geometry of the space. HypeGBMS effectively captures latent hierarchies while retaining the density-seeking behavior of GBMS. We provide theoretical insights into convergence and computational complexity, along with empirical results that demonstrate improved clustering quality in hierarchical datasets. This work bridges classical mean-shift clustering and hyperbolic representation learning, offering a principled approach to density-based clustering in curved spaces. Extensive experimental evaluations on 11 real-world datasets demonstrate that HypeGBMS significantly outperforms conventional mean-shift clustering methods in non-Euclidean settings, underscoring its robustness and effectiveness.
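The update described in the abstract (hyperbolic distances for the kernel, a Möbius-weighted mean for the shift) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the Poincaré ball model with curvature -1, a Gaussian kernel on the geodesic distance, and the weighted Möbius gyromidpoint as the mean; the abstract does not specify any of these choices. The function names `poincare_dist`, `mobius_weighted_mean`, `hyperbolic_gbms_step`, and `hyperbolic_gbms`, as well as the parameters `sigma` and `n_iter`, are hypothetical.

```python
import math


def poincare_dist(x, y):
    """Geodesic distance in the Poincare ball (curvature -1 is an assumption)."""
    diff2 = sum((a - b) ** 2 for a, b in zip(x, y))
    nx2 = sum(a * a for a in x)
    ny2 = sum(b * b for b in y)
    return math.acosh(1.0 + 2.0 * diff2 / ((1.0 - nx2) * (1.0 - ny2)))


def mobius_weighted_mean(points, weights):
    """Weighted Mobius gyromidpoint, one possible 'Mobius-weighted mean'
    (assumed here; the paper's exact definition may differ)."""
    dim = len(points[0])
    num = [0.0] * dim
    den = 0.0
    for p, w in zip(points, weights):
        lam = 2.0 / (1.0 - sum(a * a for a in p))  # conformal factor at p
        for k in range(dim):
            num[k] += w * lam * p[k]
        den += w * (lam - 1.0)
    y = [v / den for v in num]
    # Mobius scalar multiplication by 1/2 has the closed form
    # y / (1 + sqrt(1 - |y|^2)); the min() guards against rounding.
    ny2 = min(sum(v * v for v in y), 1.0)
    scale = 1.0 / (1.0 + math.sqrt(1.0 - ny2))
    return [scale * v for v in y]


def hyperbolic_gbms_step(points, sigma):
    """One blurring step: every point moves to the kernel-weighted
    hyperbolic mean of the whole current point set."""
    new_points = []
    for xi in points:
        weights = [
            math.exp(-poincare_dist(xi, xj) ** 2 / (2.0 * sigma ** 2))
            for xj in points
        ]
        new_points.append(mobius_weighted_mean(points, weights))
    return new_points


def hyperbolic_gbms(points, sigma, n_iter=5):
    """Run a fixed number of blurring steps (no stopping rule in this sketch)."""
    for _ in range(n_iter):
        points = hyperbolic_gbms_step(points, sigma)
    return points


# Two small groups on opposite sides of the ball (made-up toy data).
toy = [[0.5, 0.0], [0.52, 0.02], [-0.5, 0.0], [-0.52, -0.02]]
blurred = hyperbolic_gbms(toy, sigma=0.5, n_iter=5)
```

Because the dataset itself is replaced at every step, each iteration costs a full pairwise distance computation, which is quadratic in the number of points for this naive version. Cluster labels could then be read off by grouping points of `blurred` that have collapsed onto each other.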