Parameter-Free Clustering via Self-Supervised Consensus Maximization (Extended Version)

📅 2025-11-12

🤖 AI Summary

Existing parameter-free clustering methods often struggle to determine the number of clusters robustly, and they do not track how structural consistency changes as clusters are merged. Method: The paper proposes SCMax, a fully hyperparameter-free, self-supervised hierarchical clustering framework that couples hierarchical agglomerative clustering with a nearest-neighbor consensus score computed on structure-aware self-supervised representations. At each merge step, SCMax measures how well the nearest-neighbor merge decisions suggested by the original input space agree with those suggested by the self-supervised representation space, and it takes the number of clusters at the peak of this consensus score as its estimate. Contribution/Results: SCMax combines self-supervised representation learning with consensus maximization so that cluster-number estimation and clustering happen in a single process. On multiple benchmark datasets, the paper reports that SCMax outperforms existing clustering approaches designed for an unknown number of clusters.

📝 Abstract

Clustering is a fundamental task in unsupervised learning, but most existing methods heavily rely on hyperparameters such as the number of clusters or other sensitive settings, limiting their applicability in real-world scenarios. To address this long-standing challenge, we propose a novel and fully parameter-free clustering framework via Self-supervised Consensus Maximization, named SCMax. Our framework performs hierarchical agglomerative clustering and cluster evaluation in a single, integrated process. At each step of agglomeration, it creates a new, structure-aware data representation through a self-supervised learning task guided by the current clustering structure. We then introduce a nearest neighbor consensus score, which measures the agreement between the nearest neighbor-based merge decisions suggested by the original representation and the self-supervised one. The moment at which consensus maximization occurs can serve as a criterion for determining the optimal number of clusters. Extensive experiments on multiple datasets demonstrate that the proposed framework outperforms existing clustering approaches designed for scenarios with an unknown number of clusters.
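The nearest neighbor consensus score described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes clusters are summarized by centroids, that distances are Euclidean, and that the self-supervised representation is already available as an array (`X_ssl`, a hypothetical name). The function names are also illustrative. The score here is simply the fraction of clusters whose nearest neighboring cluster is the same in both spaces.

```python
import numpy as np

def cluster_centroids(X, labels):
    """Stack one centroid per cluster, ordered by sorted cluster label."""
    return np.stack([X[labels == c].mean(axis=0) for c in np.unique(labels)])

def nearest_cluster(centroids):
    """For each centroid, return the index of its nearest other centroid."""
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dist, np.inf)  # a cluster is never its own neighbor
    return dist.argmin(axis=1)

def nn_consensus(X_orig, X_ssl, labels):
    """Fraction of clusters with the same nearest-neighbor cluster in both spaces.

    Assumption: centroid-based Euclidean neighbors stand in for the paper's
    merge decisions; at least two clusters are required.
    """
    nn_orig = nearest_cluster(cluster_centroids(X_orig, labels))
    nn_ssl = nearest_cluster(cluster_centroids(X_ssl, labels))
    return float(np.mean(nn_orig == nn_ssl))
```

A score of 1.0 means both representations suggest the same merge partner for every cluster; lower values mean the two spaces disagree about which clusters should be merged next.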

Problem

Research questions and friction points this paper is trying to address.

Eliminates the dependency of clustering on hyperparameters such as the number of clusters

Integrates hierarchical clustering with self-supervised consensus evaluation

Determines optimal cluster number automatically through consensus maximization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-free clustering via self-supervised consensus maximization

Integrated hierarchical agglomerative clustering with structure-aware representation

Nearest neighbor consensus score determines optimal cluster count
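As a rough illustration of how a consensus score could determine the cluster count, the sketch below picks the cluster count at which a recorded consensus score peaks. It assumes the agglomeration has already produced a trace of `(num_clusters, consensus_score)` pairs, one per merge step. The function name, the trace format, and the tie-breaking rule are assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence, Tuple

def select_cluster_count(trace: Sequence[Tuple[int, float]]) -> int:
    """Return the cluster count whose consensus score is highest.

    trace: (num_clusters, consensus_score) pairs, one per merge step.
    Assumption: ties are broken in favor of the larger cluster count,
    i.e. the earlier merge step.
    """
    if not trace:
        raise ValueError("trace must contain at least one merge step")
    # Compare by score first, then by cluster count to resolve ties.
    best_k, _ = max(trace, key=lambda pair: (pair[1], pair[0]))
    return best_k
```

For example, a trace of `[(5, 0.4), (4, 0.6), (3, 0.9), (2, 0.5)]` peaks at three clusters, so the function returns 3.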
