Scalable Varied-Density Clustering via Graph Propagation

📅 2025-08-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the failure and poor scalability of conventional density-based clustering on high-dimensional data with strong local density variations, this paper proposes a scalable varied-density clustering framework. The method builds a density-adaptive approximate neighborhood graph using random projection and formulates clustering as a density-aware label propagation process over that graph, so that intra-cluster density consistency is enforced implicitly through graph connectivity. By avoiding explicit density estimation and reducing parameter sensitivity, the approach enables sublinear-time approximate nearest neighbor construction and efficient graph diffusion. On million-scale datasets it completes clustering in minutes, with accuracy competitive with existing baselines at substantially lower computational cost. The framework is thus grounded in density-aware graph modeling and practical for large-scale, high-dimensional clustering tasks.

📝 Abstract

We propose a novel perspective on varied-density clustering for high-dimensional data by framing it as a label propagation process in neighborhood graphs that adapt to local density variations. Our method formally connects density-based clustering with graph connectivity, enabling the use of efficient graph propagation techniques developed in network science. To ensure scalability, we introduce a density-aware neighborhood propagation algorithm and leverage advanced random projection methods to construct approximate neighborhood graphs. Our approach significantly reduces computational cost while preserving clustering quality. Empirically, it scales to datasets with millions of points in minutes and achieves competitive accuracy compared to existing baselines.
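The link between density-based clustering and graph connectivity can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a brute-force k-nearest-neighbor graph, keeps only mutual edges as a simple stand-in for a density-adaptive neighborhood (the neighborhood radius follows the local spacing instead of a global threshold), and propagates the minimum label along edges until nothing changes. The function names `knn_indices`, `mutual_knn_edges`, and `propagate_min_label` are hypothetical.

```python
import numpy as np

def knn_indices(X, k):
    """Exact k nearest neighbors by brute force (illustration only, O(n^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                    # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

def mutual_knn_edges(nbrs):
    """Keep edge (i, j) only if i is in j's neighbor list and j is in i's."""
    n, k = nbrs.shape
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = True
    return np.argwhere(np.triu(A & A.T))            # rows (i, j) with i < j

def propagate_min_label(n, edges, max_iter=1000):
    """Each node repeatedly adopts the smallest label among itself and its neighbors."""
    labels = np.arange(n)
    for _ in range(max_iter):
        new = labels.copy()
        np.minimum.at(new, edges[:, 0], labels[edges[:, 1]])
        np.minimum.at(new, edges[:, 1], labels[edges[:, 0]])
        if np.array_equal(new, labels):             # converged: labels are stable
            break
        labels = new
    return labels

# Two groups with very different spacing (0.1 versus 2.0), far apart.
dense = np.column_stack([np.arange(10) * 0.1, np.zeros(10)])
sparse = np.column_stack([100.0 + np.arange(10) * 2.0, np.zeros(10)])
X = np.vstack([dense, sparse])
labels = propagate_min_label(len(X), mutual_knn_edges(knn_indices(X, k=2)))
```

In this toy input each group ends up with a single label even though the spacing differs by a factor of 20, which a single global distance threshold would handle poorly. At convergence the labels identify the connected components of the graph, so the clustering is determined entirely by which edges the graph construction keeps.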

Problem

Research questions and friction points this paper is trying to address.

Clustering high-dimensional data with varying densities

Connecting density-based clustering with graph connectivity

Scaling to large datasets efficiently without losing accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Label propagation in adaptive neighborhood graphs

Density-aware neighborhood propagation algorithm

Approximate graphs via random projection methods
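The summary does not say which random projection method the paper uses, so the following is only a generic sketch of the idea under stated assumptions: project the points onto a few random directions, treat points that land close together in the sorted order as neighbor candidates, then rerank the candidates by true distance. The function name `approx_knn_random_projection` and the parameters `n_proj` and `window` are hypothetical.

```python
import numpy as np

def approx_knn_random_projection(X, k, n_proj=8, window=10, seed=0):
    """Approximate kNN: candidates from 1-D random projections, reranked exactly."""
    n, d = X.shape
    if not (k <= window < n):
        raise ValueError("need k <= window < n so every point has >= k candidates")
    rng = np.random.default_rng(seed)
    cand = [set() for _ in range(n)]
    for _ in range(n_proj):
        direction = rng.standard_normal(d)   # random Gaussian direction
        order = np.argsort(X @ direction)    # points sorted along that direction
        for pos in range(n):
            i = int(order[pos])
            lo, hi = max(0, pos - window), min(n, pos + window + 1)
            cand[i].update(int(j) for j in order[lo:hi])
    nbrs = np.empty((n, k), dtype=int)
    for i in range(n):
        c = np.array(sorted(cand[i] - {i}))  # candidate ids, excluding the point itself
        dist = np.linalg.norm(X[c] - X[i], axis=1)
        nbrs[i] = c[np.argsort(dist)[:k]]    # keep the k closest candidates
    return nbrs

X_demo = np.random.default_rng(1).standard_normal((200, 16))
nbrs_demo = approx_knn_random_projection(X_demo, k=5)
```

Larger `n_proj` and `window` values trade speed for recall; when `window` is at least `n - 1` every point is a candidate and the result is the exact k-nearest-neighbor list. Practical systems typically use tree-based or hashing-based variants of this idea instead of the plain loops shown here.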
