TANGO: Clustering with Typicality-Aware Nonlocal Mode-Seeking and Graph-Cut Optimization

📅 2024-08-19
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Density-based clustering by mode-seeking relies heavily on local structure and neglects global characteristics, which leads to errors in peak selection and dependency establishment. Adding hyperparameters that revise the dependencies can mitigate this, but tuning them is difficult on real-world data. To address these issues, the paper proposes TANGO, a typicality-aware nonlocal mode-seeking framework. Its key contributions are: (1) a global-view typicality measure, obtained by further mining the density distributions and the initial dependencies, that is used to recalibrate local dependencies; (2) path-based connectivity between sub-clusters combined with graph-cut optimization, which produces the final partition without selecting cluster centers; and (3) a theoretical analysis together with an efficient method for computing typicality. Experiments on several synthetic and 16 real-world datasets are reported to demonstrate the effectiveness and superiority of the method.

📝 Abstract
Density-based clustering methods by mode-seeking usually achieve clustering by using local density estimation to mine structural information, such as local dependencies from lower density points to higher neighbors. However, they often rely too heavily on local structures and neglect global characteristics, which can lead to significant errors in peak selection and dependency establishment. Although introducing more hyperparameters that revise dependencies can help mitigate this issue, tuning them is challenging and even impossible on real-world datasets. In this paper, we propose a new algorithm (TANGO) to establish local dependencies by exploiting a global-view typicality of points, which is obtained by mining further the density distributions and initial dependencies. TANGO then obtains sub-clusters with the help of the adjusted dependencies, and characterizes the similarity between sub-clusters by incorporating path-based connectivity. It achieves final clustering by employing graph-cut on sub-clusters, thus avoiding the challenging selection of cluster centers. Moreover, this paper provides theoretical analysis and an efficient method for the calculation of typicality. Experimental results on several synthetic and 16 real-world datasets demonstrate the effectiveness and superiority of TANGO.
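To make the starting point concrete, the following is a minimal sketch of the classic local mode-seeking step that the abstract describes as the baseline: each point depends on its nearest higher-density neighbor, and points with no such neighbor act as peaks. It is not TANGO itself and does not compute typicality. The k-nearest-neighbor density proxy, the parameter `k`, and the function names `knn_density`, `local_dependencies`, and `assign_to_peaks` are all assumptions made for illustration.

```python
import math

def knn_density(points, k):
    """Assumed density proxy: inverse of the mean distance to the k nearest neighbors."""
    dens = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        dens.append(k / (sum(dists[:k]) + 1e-12))
    return dens

def local_dependencies(points, k):
    """Link each point to its nearest higher-density point among its k nearest neighbors.

    parent[i] == i marks a local density peak (no denser neighbor was found).
    """
    dens = knn_density(points, k)
    parent = list(range(len(points)))
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        nbrs = sorted(others, key=lambda j: math.dist(p, points[j]))[:k]
        for j in nbrs:  # neighbors are visited from nearest to farthest
            if dens[j] > dens[i]:
                parent[i] = j
                break
    return parent

def assign_to_peaks(parent):
    """Follow dependencies upward; density strictly increases, so every chain ends at a peak."""
    labels = []
    for start in range(len(parent)):
        node = start
        while parent[node] != node:
            node = parent[node]
        labels.append(node)
    return labels

# Two small, well-separated groups, each with a dense point in the middle.
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (5.05, 5.05)]
labels = assign_to_peaks(local_dependencies(points, k=3))
```

Because every decision here uses only the `k` nearest neighbors, a noisy density estimate can create spurious peaks or misdirected links. That purely local view is the weakness the abstract says TANGO addresses by adjusting dependencies with a global-view typicality.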
Problem

Research questions and friction points this paper is trying to address.

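Mode-seeking methods rely too heavily on local density structure, which causes errors in peak selection and dependency establishment
Hyperparameters that revise dependencies are hard, and sometimes impossible, to tune on real-world datasets
Selecting cluster centers is difficult, which motivates a final partition obtained by graph-cut over path-based sub-cluster similarities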
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses typicality for global-view mode detection
Employs graph-cut with path-based similarity
Combines nonlocal mode-seeking and optimization
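
The idea of cutting a graph of sub-clusters with path-based similarity can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: the max-min (bottleneck) path definition, the exhaustive two-way normalized cut, the toy similarity matrix, and the names `path_based_similarity` and `best_normalized_cut` are all hypothetical choices made for this example.

```python
from itertools import combinations

def path_based_similarity(S):
    """Max-min path similarity between all pairs of nodes.

    S is a symmetric list-of-lists of direct similarities. The result holds,
    for each pair, the strength of the weakest edge on the best connecting path.
    Computed with a Floyd-Warshall style triple loop.
    """
    n = len(S)
    P = [row[:] for row in S]
    for mid in range(n):
        for i in range(n):
            for j in range(n):
                via = min(P[i][mid], P[mid][j])
                if via > P[i][j]:
                    P[i][j] = via
    return P

def best_normalized_cut(W):
    """Try every two-way split and return (score, side_a, side_b) with the lowest
    normalized cut. Exhaustive search, so only sensible for a handful of nodes."""
    n = len(W)
    degree = [sum(W[i][j] for j in range(n) if j != i) for i in range(n)]
    best = None
    for size in range(1, n // 2 + 1):
        for combo in combinations(range(n), size):
            side_a = set(combo)
            side_b = [j for j in range(n) if j not in side_a]
            cut = sum(W[i][j] for i in side_a for j in side_b)
            vol_a = sum(degree[i] for i in side_a)
            vol_b = sum(degree[j] for j in side_b)
            if vol_a == 0 or vol_b == 0:
                continue
            score = cut / vol_a + cut / vol_b
            if best is None or score < best[0]:
                best = (score, sorted(side_a), side_b)
    return best

# Toy chain of four sub-clusters: 0-1 and 2-3 are strongly linked, 1-2 only weakly.
S = [[0.0, 0.9, 0.0, 0.0],
     [0.9, 0.0, 0.1, 0.0],
     [0.0, 0.1, 0.0, 0.8],
     [0.0, 0.0, 0.8, 0.0]]
P = path_based_similarity(S)
result = best_normalized_cut(P)
```

In this toy chain, sub-clusters 0 and 3 have no direct similarity, yet the path-based matrix connects them through the weak 1-2 link, and the cut separates the two strongly linked pairs without any cluster center being chosen. With many sub-clusters, a spectral relaxation of the cut would be the usual substitute for the exhaustive search.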