AI Summary
Unsupervised clustering of hyperspectral images (HSIs) suffers from insufficient exploitation of 3D structural information and misalignment between superpixel representation learning and clustering objectives. To address these challenges, we propose the Superpixel Graph Contrastive Clustering (SPGCC) framework, the first to deeply integrate graph contrastive learning with superpixel structures for clustering. SPGCC introduces HSI-specific semantic-invariant augmentations (pixel sampling and weight perturbation), a hybrid 3D/2D convolutional backbone pretrained for spectral-spatial feature extraction, and a dual-level contrastive mechanism that aligns sample pairs and contrasts cluster centroids. Furthermore, it employs an alternating optimization strategy that jointly refines representation learning and clustering. Extensive experiments on multiple benchmark HSI datasets demonstrate consistent state-of-the-art performance in accuracy (ACC), normalized mutual information (NMI), and adjusted Rand index (ARI). The source code is publicly available.
Abstract
Hyperspectral image (HSI) clustering is an important but challenging task. State-of-the-art (SOTA) methods usually rely on superpixels; however, they do not fully utilize the spatial and spectral information in the 3-D structure of HSI, and their optimization targets are not clustering-oriented. In this work, we first use 3-D and 2-D hybrid convolutional neural networks to extract the high-order spatial and spectral features of HSI through pre-training, and then design a superpixel graph contrastive clustering (SPGCC) model to learn discriminative superpixel representations. Reasonable augmented views are crucial for contrastive clustering, and conventional contrastive learning may hurt the cluster structure because different samples are pushed apart in the embedding space even if they belong to the same class. In SPGCC, we design two semantic-invariant data augmentations for HSI superpixels: pixel sampling augmentation and model weight augmentation. Sample-level alignment and clustering-center-level contrast are then performed to improve the intra-class similarity and inter-class dissimilarity of superpixel embeddings. We perform clustering and network optimization alternately. Experimental results on several HSI datasets verify the advantages of the proposed SPGCC compared to SOTA methods. Our code is available at https://github.com/jhqi/spgcc.
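To make the pixel sampling augmentation and the two contrast levels more concrete, the following is a minimal NumPy sketch and not the authors' implementation. All function names, the sampling ratio, the cosine-based alignment loss, and the InfoNCE-style center loss with a temperature are assumptions introduced for illustration; the abstract does not specify these forms. The sketch also assumes that every superpixel contains at least one pixel.

```python
import numpy as np


def pixel_sampling_view(features, labels, n_superpixels, ratio, rng):
    """Hypothetical pixel sampling augmentation: mean-pool a random
    subset of the pixels inside each superpixel."""
    view = np.zeros((n_superpixels, features.shape[1]))
    for s in range(n_superpixels):
        idx = np.flatnonzero(labels == s)          # pixels of superpixel s
        k = max(1, int(round(ratio * idx.size)))   # number of pixels to keep
        chosen = rng.choice(idx, size=k, replace=False)
        view[s] = features[chosen].mean(axis=0)
    return view


def l2_normalize(x):
    """Scale each row to unit Euclidean norm."""
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norm, 1e-12)


def alignment_loss(z1, z2):
    """Assumed sample-level alignment: mean (1 - cosine similarity)
    between the two views of the same superpixel. No negatives are used,
    so same-class superpixels are not pushed apart."""
    z1, z2 = l2_normalize(z1), l2_normalize(z2)
    return float(np.mean(1.0 - np.sum(z1 * z2, axis=1)))


def center_contrast_loss(c1, c2, temperature=0.5):
    """Assumed center-level contrast: InfoNCE over cluster centers, where
    center k in view 1 is positive with center k in view 2 and negative
    with every other center."""
    c1, c2 = l2_normalize(c1), l2_normalize(c2)
    logits = c1 @ c2.T / temperature
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Toy data: 200 pixels with 8-dimensional features, 10 superpixels.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))
labels = np.arange(200) % 10
view1 = pixel_sampling_view(features, labels, 10, 0.5, rng)
view2 = pixel_sampling_view(features, labels, 10, 0.5, rng)
print("alignment loss:", alignment_loss(view1, view2))
```

In this sketch the alignment term only pulls the two views of one superpixel together, while negatives appear only between cluster centers, which reflects the concern in the abstract that sample-level negatives can separate superpixels of the same class.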