scAGC: Learning Adaptive Cell Graphs with Contrastive Guidance for Single-Cell Clustering

📅 2025-08-07
🤖 AI Summary
To address graph-structure noise sensitivity, long-tailed distributions, and high-dimensional sparsity in single-cell RNA sequencing (scRNA-seq) clustering, this paper proposes scAGC, a clustering method built around a topology-adaptive graph autoencoder. scAGC learns the cell graph and the feature representations jointly and end-to-end: differentiable Gumbel-Softmax sampling refines the graph structure during training, a contrastive learning objective regularizes graph learning, and a zero-inflated negative binomial (ZINB) reconstruction loss makes feature reconstruction robust to sparse count data. Experiments on nine real scRNA-seq datasets show that scAGC outperforms other state-of-the-art methods, yielding the best normalized mutual information (NMI) on all nine datasets and the best adjusted Rand index (ARI) on seven of the nine. By combining graph structure learning, contrastive guidance, and long-tail-aware modeling in one self-supervised clustering framework, scAGC aims to improve the accuracy of cell type annotation.
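The two scores reported above, NMI and ARI, both compare predicted cluster labels against reference cell-type labels and are invariant to how the clusters are numbered. The following is a minimal standard-library sketch of the textbook definitions, not the paper's evaluation code; the function names are illustrative, and the arithmetic-mean normalization used for NMI is an assumption, since the summary does not say which variant was used.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_true, labels_pred):
    """ARI from pair counts in the contingency table (assumes >= 2 samples)."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (sum_joint - expected) / (max_index - expected)


def normalized_mutual_info(labels_true, labels_pred):
    """NMI = MI / mean(H_true, H_pred); the arithmetic mean is an assumption."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)
    mi = sum(
        (c / n) * log(c * n / (true_counts[t] * pred_counts[p]))
        for (t, p), c in joint.items()
    )
    h_true = -sum((c / n) * log(c / n) for c in true_counts.values())
    h_pred = -sum((c / n) * log(c / n) for c in pred_counts.values())
    denom = 0.5 * (h_true + h_pred)
    return mi / denom if denom > 0 else 1.0


# Toy example: same partition, different cluster numbering.
reference = [0, 0, 1, 1, 2, 2]
predicted = [1, 1, 0, 0, 2, 2]
ari = adjusted_rand_index(reference, predicted)
nmi = normalized_mutual_info(reference, predicted)
```

Both scores reach their maximum of 1 when the two partitions agree up to relabeling, which is why they suit clustering, where cluster IDs carry no meaning.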

📝 Abstract
Accurate cell type annotation is a crucial step in analyzing single-cell RNA sequencing (scRNA-seq) data, which provides valuable insights into cellular heterogeneity. However, due to the high dimensionality and prevalence of zero elements in scRNA-seq data, traditional clustering methods face significant statistical and computational challenges. While some advanced methods use graph neural networks to model cell-cell relationships, they often depend on static graph structures that are sensitive to noise and fail to capture the long-tailed distribution inherent in single-cell populations. To address these limitations, we propose scAGC, a single-cell clustering method that learns adaptive cell graphs with contrastive guidance. Our approach optimizes feature representations and cell graphs simultaneously in an end-to-end manner. Specifically, we introduce a topology-adaptive graph autoencoder that leverages a differentiable Gumbel-Softmax sampling strategy to dynamically refine the graph structure during training. This adaptive mechanism mitigates the problem of a long-tailed degree distribution by promoting a more balanced neighborhood structure. To model the discrete, over-dispersed, and zero-inflated nature of scRNA-seq data, we integrate a Zero-Inflated Negative Binomial (ZINB) loss for robust feature reconstruction. Furthermore, a contrastive learning objective is incorporated to regularize the graph learning process and prevent abrupt changes in the graph topology, ensuring stability and enhancing convergence. Comprehensive experiments on 9 real scRNA-seq datasets demonstrate that scAGC consistently outperforms other state-of-the-art methods, yielding the best NMI and ARI scores on 9 and 7 datasets, respectively. Our code is available at Anonymous GitHub.
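To make the Gumbel-Softmax step in the abstract concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the abstract does not say how edge logits are parameterized, so this sketch assumes each cell's logits are its similarity scores to the other cells, and `sample_soft_adjacency`, `tau`, and `seed` are hypothetical names. A real model would run the same computation inside an autograd framework so that gradients flow through the relaxed sample.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    eps = 1e-12
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are defined.
        u = min(max(rng.random(), eps), 1.0 - eps)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    top = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sample_soft_adjacency(similarity, tau=0.5, seed=0):
    """One soft neighbor distribution per cell (row), self-loops masked out."""
    rng = random.Random(seed)
    adjacency = []
    for i, row in enumerate(similarity):
        # Assumption: raw similarity scores serve as edge logits.
        logits = [s if j != i else -1e9 for j, s in enumerate(row)]
        adjacency.append(gumbel_softmax(logits, tau, rng))
    return adjacency


# Toy similarity matrix for three cells.
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
soft_graph = sample_soft_adjacency(similarity, tau=0.5, seed=0)
```

Lower values of `tau` push each row toward a hard one-hot neighbor choice, while higher values spread weight across candidates. Because every row is normalized to sum to one, each cell distributes the same total edge weight, which is one simple way to picture a more balanced neighborhood structure.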
Problem

Research questions and friction points this paper is trying to address.

Accurate cell type annotation in scRNA-seq data analysis
Overcoming high dimensionality and zero-inflated data challenges
Improving graph-based clustering with adaptive structures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive cell graphs with contrastive guidance learning
Differentiable Gumbel-Softmax sampling for dynamic refinement
ZINB loss for robust feature reconstruction, with a contrastive objective that stabilizes graph learning
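The ZINB reconstruction loss named above can be illustrated with the standard zero-inflated negative binomial likelihood. The sketch below is a scalar, standard-library version under the usual mean/dispersion parameterization, which is an assumption here; the parameter names `mu`, `theta`, and `pi` and both function names are illustrative and not taken from the paper's code.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log NB probability of count x with mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count under ZINB.

    pi in [0, 1) is the probability of a structural (dropout) zero.
    """
    if x == 0:
        # A zero is either a dropout or a genuine NB zero.
        p_zero = pi + (1.0 - pi) * math.exp(nb_log_pmf(0, mu, theta))
        return -math.log(p_zero)
    # A positive count can only come from the NB component.
    return -(math.log(1.0 - pi) + nb_log_pmf(x, mu, theta))


def mean_zinb_loss(counts, mus, thetas, pis):
    """Average ZINB loss over paired observations and predicted parameters."""
    losses = [zinb_nll(x, m, t, p) for x, m, t, p in zip(counts, mus, thetas, pis)]
    return sum(losses) / len(losses)


# Toy usage: four observed counts with shared predicted parameters.
loss = mean_zinb_loss([0, 0, 3, 7], [2.0] * 4, [1.5] * 4, [0.3] * 4)
```

In an autoencoder setting, the decoder would typically output `mu`, `theta`, and `pi` per gene and per cell, and this loss would replace a squared-error reconstruction term so that the many zeros in scRNA-seq counts are modeled explicitly.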
Authors

Huifa Li
East China Normal University
Deep Learning, Graph Neural Network, LLM, AI4Science
Jie Fu
Department of Computer Science, Stevens Institute of Technology, Hoboken, USA
Xinlin Zhuang
Department of Computational Biology, Mohamed bin Zayed University of AI, Abu Dhabi, UAE
Haolin Yang
University of Chicago
Large language models, natural language processing
Xinpeng Ling
Tongji University
Federated Learning, Differential Privacy, Convex Optimization
Tong Cheng
Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai, China
Haochen Xue
Department of Computational Biology, Mohamed bin Zayed University of AI, Abu Dhabi, UAE
Imran Razzak
MBZUAI, Abu Dhabi
Human-Centered AI, Medical Image Analysis, Medical Artificial Intelligence, Computational Biology
Zhili Chen
Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai, China