Refinement Contrastive Learning of Cell-Gene Associations for Unsupervised Cell Type Identification

📅 2025-12-11
🤖 AI Summary
Unsupervised cell type identification faces significant challenges in distinguishing closely related cell types, primarily due to the neglect of biologically meaningful cell–gene associations in existing methods. To address this, we propose scRCL—a novel framework that jointly integrates dual-contrast distribution alignment with a gene-correlation-driven representation refinement module. scRCL explicitly models the synergistic cell–gene structural relationships and further incorporates gene co-expression priors with graph-augmented embedding learning. Evaluated on multiple benchmark single-cell RNA-seq and spatial transcriptomics datasets, scRCL consistently outperforms state-of-the-art methods, achieving absolute improvements of 3.2–9.7% in clustering accuracy. Moreover, the identified cell clusters exhibit strong biological coherence and yield interpretable, biologically meaningful gene expression signatures.

📝 Abstract
Unsupervised cell type identification is crucial for uncovering and characterizing heterogeneous populations in single-cell omics studies. Although a range of clustering methods have been developed, most focus exclusively on intrinsic cellular structure and ignore the pivotal role of cell-gene associations, which limits their ability to distinguish closely related cell types. To this end, we propose a Refinement Contrastive Learning framework (scRCL) that explicitly incorporates cell-gene interactions to derive more informative representations. Specifically, we introduce two contrastive distribution alignment components that reveal reliable intrinsic cellular structures by effectively exploiting cell-cell structural relationships. Additionally, we develop a refinement module that integrates gene-correlation structure learning to enhance cell embeddings by capturing underlying cell-gene associations. This module strengthens connections between cells and their associated genes, refining representation learning to exploit biologically meaningful relationships. Extensive experiments on several single-cell RNA-seq and spatial transcriptomics benchmark datasets demonstrate that our method consistently outperforms state-of-the-art baselines in cell-type identification accuracy. Moreover, downstream biological analyses confirm that the recovered cell populations exhibit coherent gene-expression signatures, further validating the biological relevance of our approach. The code is available at https://github.com/THPengL/scRCL.
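
To make the idea of contrastive alignment over cells more concrete, here is a minimal sketch of a generic InfoNCE-style loss between two views of the same cells. It is not the scRCL objective: the function name `info_nce`, the temperature value, and the toy data are all assumptions made for illustration, and the authors' actual losses are in the linked repository.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.5):
    """Generic InfoNCE loss; row i of z1 and row i of z2 are two views of cell i.

    Illustrative only: this is a standard contrastive loss, not the
    distribution alignment components described in the abstract.
    """
    # L2-normalise so the dot product is cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    # Subtract the row maximum for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
cells = rng.normal(size=(8, 16))                      # hypothetical cell embeddings
noisy_view = cells + 0.01 * rng.normal(size=cells.shape)
aligned = info_nce(cells, noisy_view)                 # views of the same cells
shuffled = info_nce(cells, np.roll(cells, 1, axis=0)) # every pair mismatched
print(aligned, shuffled)
```

The loss is low when each cell's two views are closer to each other than to other cells, and higher when the pairing is broken, which is the signal a contrastive objective uses to shape the embedding space.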

Problem

Research questions and friction points this paper is trying to address.

Unsupervised identification of cell types in single-cell studies
Existing clustering methods ignore cell-gene associations, which limits clustering accuracy
Representations built only from intrinsic cellular structure struggle to distinguish closely related cell types

Innovation

Methods, ideas, or system contributions that make the work stand out.

Refinement Contrastive Learning framework integrates cell-gene interactions
Two contrastive distribution alignment components exploit cell-cell relationships
Refinement module captures gene-correlation structure to enhance embeddings
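
As a rough illustration of how gene-correlation structure could feed back into cell representations, the sketch below builds a fixed gene-gene graph from Pearson correlations and smooths each cell's expression over it. This is an assumption-laden stand-in, not the scRCL refinement module, which learns the gene-correlation structure; the helper names `gene_correlation_graph` and `refine_cells`, the neighbour count `k`, and the mixing weight `alpha` are hypothetical.

```python
import numpy as np

def gene_correlation_graph(expr, k=3):
    """Symmetric k-nearest-neighbour graph over genes from Pearson correlation.

    expr is a cells x genes matrix. Hypothetical helper for illustration.
    """
    corr = np.corrcoef(expr, rowvar=False)        # genes x genes
    np.fill_diagonal(corr, -np.inf)               # exclude self-correlation
    n_genes = corr.shape[0]
    top = np.argsort(corr, axis=1)[:, -k:]        # k most correlated genes per gene
    adj = np.zeros((n_genes, n_genes))
    adj[np.repeat(np.arange(n_genes), k), top.ravel()] = 1.0
    return np.maximum(adj, adj.T)                 # make the graph undirected

def refine_cells(expr, adj, alpha=0.5):
    """Blend each gene's value with the mean of its graph neighbours, per cell."""
    norm = adj / np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    return (1 - alpha) * expr + alpha * expr @ norm.T

rng = np.random.default_rng(0)
expr = rng.poisson(2.0, size=(50, 10)).astype(float)  # toy count matrix
adj = gene_correlation_graph(expr, k=3)
refined = refine_cells(expr, adj)
print(refined.shape)
```

A fixed correlation graph is the simplest possible choice here; a learned structure, as the summary describes, would replace `gene_correlation_graph` with a trainable component.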
Liang Peng
Department of Computer Science, Shantou University
Haopeng Liu
Department of Computer Science, Shantou University
Yixuan Ye
Data Scientist - Research, Google LLC
Statistical Modeling, Genetic Prediction
Cheng Liu
College of Computer Science and Technology, Huaqiao University
Wenjun Shen
Shantou University Medical College, Shantou University
Si Wu
School of Computer Science and Engineering, South China University of Technology
Hau-San Wong
Professor, Department of Computer Science, City University of Hong Kong
Artificial Intelligence, Machine learning, Data mining, Bioinformatics