🤖 AI Summary
Spectral clustering has two major shortcomings: its stages are optimized disjointly, and its representation capacity is limited. To address these, we propose BootSC, a deep spectral clustering framework that jointly learns affinity matrix construction, spectral embedding, and cluster assignment in a single end-to-end network. Our key contributions are: (1) optimal-transport-derived supervision that bootstraps the affinity matrix and the cluster assignment matrix in a fully unsupervised setting; and (2) a semantically consistent orthogonal re-parameterization that orthogonalizes the spectral embeddings and improves their discriminability. On benchmarks including the challenging ImageNet-Dogs dataset, BootSC achieves state-of-the-art performance, improving Normalized Mutual Information (NMI) by 16% over the runner-up method.
📝 Abstract
Spectral clustering is a leading clustering method. Two of its major shortcomings are its disjoint optimization process and its limited representation capacity. To address these issues, we propose a deep spectral clustering model (named BootSC), which jointly learns all stages of spectral clustering (affinity matrix construction, spectral embedding, and $k$-means clustering) using a single network in an end-to-end manner. BootSC leverages effective and efficient optimal-transport-derived supervision to bootstrap the affinity matrix and the cluster assignment matrix. Moreover, a semantically consistent orthogonal re-parameterization technique is introduced to orthogonalize the spectral embeddings, significantly enhancing their discrimination capability. Experimental results indicate that BootSC achieves state-of-the-art clustering performance. For example, it achieves a notable 16% NMI improvement over the runner-up method on the challenging ImageNet-Dogs dataset. Our code is available at https://github.com/spdj2271/BootSC.
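
For context, the three stages named in the abstract can be sketched in their classical, disjoint form, where each stage is computed separately and none receives feedback from the others. This is a minimal illustration of that baseline pipeline, not of BootSC itself. The Gaussian kernel, the `sigma` parameter, the symmetric normalization, the farthest-point initialization, and all function names are assumptions chosen for the example and do not come from the paper.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Stage 1: dense Gaussian-kernel affinity matrix with a zero diagonal."""
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    W = np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def spectral_embedding(W, k):
    """Stage 2: top-k eigenvectors of D^{-1/2} W D^{-1/2}, rows scaled to unit length."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)  # eigh returns eigenvalues in ascending order
    U = vecs[:, -k:]             # keep the k largest
    return U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

def kmeans(U, k, n_iter=50):
    """Stage 3: Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma=1.0):
    """Run the three stages in sequence; no stage is tuned by the later ones."""
    W = gaussian_affinity(X, sigma)
    U = spectral_embedding(W, k)
    return kmeans(U, k)

# Usage: two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
labels = spectral_clustering(X, k=2)
```

In this sketch the affinity matrix is fixed before the embedding is computed, and the embedding is fixed before $k$-means runs. That separation is the disjoint optimization the abstract refers to, which BootSC replaces with joint learning in a single network.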