🤖 AI Summary
This work applies diffusion models to unsupervised clustering for the first time, addressing the longstanding challenge of jointly optimizing feature discriminability and clustering robustness. The proposed framework, Clustering via Diffusion (CLUDI), extracts high-dimensional semantic features with a pretrained Vision Transformer (ViT), then employs a self-supervised diffusion model as a teacher network that generates diverse, structure-preserving pseudo-labels through iterative denoising. A student network distills these outputs into stable cluster assignments. Crucially, the diffusion process is reinterpreted as both an implicit data augmentation mechanism and an uncertainty-aware clustering prior, markedly improving resilience to input noise, intra-class variation, and complex manifold structures. Evaluated on standard benchmarks, including CIFAR-10/100 and ImageNet-Dogs, CLUDI achieves state-of-the-art performance, improving average clustering accuracy by 3.2% while demonstrating superior generalization and robustness.
📝 Abstract
Diffusion models, widely recognized for their success in generative tasks, have not yet been applied to clustering. We introduce Clustering via Diffusion (CLUDI), a self-supervised framework that combines the generative power of diffusion models with pre-trained Vision Transformer features to achieve robust and accurate clustering. CLUDI is trained via a teacher-student paradigm: the teacher uses stochastic diffusion-based sampling to produce diverse cluster assignments, which the student refines into stable predictions. This stochasticity acts as a novel data augmentation strategy, enabling CLUDI to uncover intricate structures in high-dimensional data. Extensive evaluations on challenging datasets demonstrate that CLUDI achieves state-of-the-art performance in unsupervised classification, setting new benchmarks in clustering robustness and adaptability to complex data distributions.
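The teacher-student mechanism described above can be illustrated with a toy sketch: a stochastic teacher runs several noisy denoising passes over fixed features (standing in for frozen ViT embeddings) and averages them into an uncertainty-aware soft target, which a deterministic student head then distills. Everything below — the shapes, the linear "denoiser", the step sizes, and the sampling schedule — is an illustrative assumption, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's settings):
# N samples, D feature dims, K clusters, T diffusion steps.
N, D, K, T = 8, 32, 4, 10
features = rng.standard_normal((N, D))      # stand-in for frozen ViT features
W = rng.standard_normal((D, K)) * 0.1       # toy linear "denoiser" head

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def teacher_sample(feats, steps=T, noise_scale=0.5):
    """One stochastic reverse-diffusion pass: start from pure noise and
    iteratively denoise cluster logits conditioned on the features."""
    z = rng.standard_normal((feats.shape[0], K))  # z_T ~ N(0, I)
    for t in range(steps, 0, -1):
        drift = feats @ W                          # data-conditioned signal
        z = z + 0.1 * (drift - z)                  # denoising step toward it
        # Injected noise shrinks over time; this stochasticity is what acts
        # as an implicit augmentation across repeated samples.
        z += noise_scale * np.sqrt(t / steps) * rng.standard_normal(z.shape)
    return softmax(z)                              # soft cluster assignment

# Teacher: average several stochastic samples into one soft target,
# so frequently co-assigned points get confident rows, ambiguous ones stay flat.
samples = np.stack([teacher_sample(features) for _ in range(16)])
teacher_target = samples.mean(axis=0)              # (N, K), rows sum to 1

# Student: a deterministic head distilled toward the teacher's target
# (a single cross-entropy gradient step shown for illustration).
V = np.zeros((D, K))
student_probs = softmax(features @ V)              # starts uniform
grad = features.T @ (student_probs - teacher_target) / N
V -= 1.0 * grad
```

Averaging multiple stochastic teacher passes is what turns the diffusion noise into a clustering prior here: the student never sees any single noisy sample, only the stabilized consensus.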