🤖 AI Summary
Existing anchor-based multi-view clustering methods lack a principled optimization foundation and rely on opaque architectural design, which is especially limiting in large-scale settings. Method: This paper proposes LargeMvC-Net, an optimization-driven deep unfolding network that decomposes the iterative anchor-based clustering procedure into three learnable modules (representation learning, noise suppression, and anchor indicator estimation) and adds an unsupervised multi-view reconstruction loss to encourage cross-view consistency. Each module corresponds to an explicit step of the underlying optimization procedure, which supports both interpretability and scalability. Contribution/Results: Evaluated on several large-scale multi-view benchmarks, the method improves clustering accuracy over current state-of-the-art approaches while maintaining linear time complexity.
📝 Abstract
Deep anchor-based multi-view clustering methods enhance the scalability of neural networks by utilizing representative anchors to reduce the computational complexity of large-scale clustering. Despite their scalability advantages, existing approaches often incorporate anchor structures in a heuristic or task-agnostic manner, either through post-hoc graph construction or as auxiliary components for message passing. Such designs overlook the core structural demands of anchor-based clustering, neglecting key optimization principles. To bridge this gap, we revisit the underlying optimization problem of large-scale anchor-based multi-view clustering and unfold its iterative solution into a novel deep network architecture, termed LargeMvC-Net. The proposed model decomposes the anchor-based clustering process into three modules: RepresentModule, NoiseModule, and AnchorModule, corresponding to representation learning, noise suppression, and anchor indicator estimation. Each module is derived by unfolding a step of the original optimization procedure into a dedicated network component, providing structural clarity and optimization traceability. In addition, an unsupervised reconstruction loss aligns each view with the anchor-induced latent space, encouraging consistent clustering structures across views. Extensive experiments on several large-scale multi-view benchmarks show that LargeMvC-Net consistently outperforms state-of-the-art methods in terms of both effectiveness and scalability.
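The abstract does not state the objective or the update rules, so the following is only a rough sketch of what unfolding an anchor-based iteration into layers might look like. It assumes a generic objective of the form 0.5 * ||X_v - A_v Z - E_v||_F^2 + tau * ||E_v||_1 per view, with a shared anchor indicator matrix Z. All function names, the `step` and `tau` parameters, and the softmax normalization are hypothetical and are not taken from LargeMvC-Net; the comments map the steps only loosely onto the three module roles named above.

```python
import numpy as np

def soft_threshold(x, tau):
    # Proximal operator of tau * ||x||_1 (assumed noise model, not from the paper).
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def softmax_cols(x):
    # Turn each column into a distribution over anchors (numerically stable).
    x = x - x.max(axis=0, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=0, keepdims=True)

def unfolded_layer(Xs, As, Z, Es, step, tau):
    """One hypothetical unfolded iteration.

    Xs: list of views, each of shape (d_v, n)
    As: list of anchor matrices, each of shape (d_v, m)
    Z:  shared anchor indicators, shape (m, n)
    Es: list of noise estimates, each of shape (d_v, n)
    """
    # Representation-style step: one gradient step on the fit term,
    # averaged over views.
    grad = sum(A.T @ (X - E - A @ Z) for X, A, E in zip(Xs, As, Es)) / len(Xs)
    H = Z + step * grad
    # Anchor-indicator-style step: normalize each sample's anchor weights.
    Z_new = softmax_cols(H)
    # Noise-style step: sparse residual via soft-thresholding.
    Es_new = [soft_threshold(X - A @ Z_new, tau) for X, A in zip(Xs, As)]
    return Z_new, Es_new

def unfolded_forward(Xs, As, n_layers=3, step=0.1, tau=0.05):
    # In a trained unfolding network, step and tau would typically be
    # learnable per layer; here they are fixed scalars for illustration.
    m, n = As[0].shape[1], Xs[0].shape[1]
    Z = np.full((m, n), 1.0 / m)
    Es = [np.zeros_like(X) for X in Xs]
    for _ in range(n_layers):
        Z, Es = unfolded_layer(Xs, As, Z, Es, step, tau)
    return Z, Es

def reconstruction_loss(Xs, As, Z, Es):
    # Unsupervised loss: how well each view is rebuilt from the shared Z.
    per_view = [np.linalg.norm(X - A @ Z - E) ** 2 for X, A, E in zip(Xs, As, Es)]
    return sum(per_view) / len(Xs)
```

In this sketch the per-layer cost is dominated by products between a (d_v, m) anchor matrix and an (m, n) indicator matrix, so it grows linearly in the number of samples n for a fixed number of anchors m, which is the usual motivation for anchor-based formulations.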