Supervised Graph Contrastive Learning for Gene Regulatory Network

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing graph contrastive learning (GCL) methods applied to gene regulatory networks (GRNs) augment graphs with non-biological perturbations and overlook biologically meaningful ones such as gene knockdowns. To address this, we propose SupGCL, a supervised GCL method that uses perturbations derived from gene knockdown experiments as supervision and extends existing GCL methods to a probabilistic model. The learned GRN representations support both graph-level tasks (patient hazard prediction, disease subtype classification) and a node-level task (gene function classification). On GRN datasets derived from patients with multiple types of cancer, SupGCL outperforms state-of-the-art baselines in all reported experiments.

📝 Abstract
Graph representation learning is effective for obtaining a meaningful latent space that exploits the structure of graph data, and it is widely applied, including to biological networks. In particular, Graph Contrastive Learning (GCL) has emerged as a powerful self-supervised method that relies on applying perturbations to graphs for data augmentation. However, when applied to biological networks such as Gene Regulatory Networks (GRNs), existing GCL methods overlook biologically meaningful perturbations, e.g., gene knockdowns. In this study, we introduce SupGCL (Supervised Graph Contrastive Learning), a novel GCL method for GRNs that directly incorporates biological perturbations derived from gene knockdown experiments as supervision. SupGCL mathematically extends existing GCL methods, which use non-biological perturbations, to a probabilistic model that introduces actual biological gene perturbations from gene knockdown data. Using the GRN representations obtained by the proposed method, we aim to improve performance on biological downstream tasks such as patient hazard prediction and disease subtype classification (graph-level tasks) and gene function classification (node-level task). We applied SupGCL to real GRN datasets derived from patients with multiple types of cancer, and in all experiments SupGCL achieved better performance than state-of-the-art baselines.
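For readers unfamiliar with the baseline setting the abstract refers to, the following is a minimal sketch of conventional GCL with a non-biological augmentation (random edge dropping) and a generic InfoNCE loss. It is not the SupGCL objective, which the abstract does not specify. The function names `drop_edges`, `embed`, and `info_nce`, the one-layer encoder, and the random toy graph are all assumptions made for illustration.

```python
import numpy as np

def drop_edges(adj, drop_prob, rng):
    """Artificial augmentation: zero out each edge independently with probability drop_prob."""
    keep = rng.random(adj.shape) >= drop_prob
    return adj * keep

def embed(adj, features, weight):
    """Hypothetical toy encoder: one mean-aggregation layer with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1, keepdims=True)
    return np.tanh((a_hat / deg) @ features @ weight)

def info_nce(z1, z2, temperature=0.5):
    """Generic InfoNCE loss: row i of z1 and row i of z2 form the positive pair."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy directed graph standing in for a GRN (random, not real data).
rng = np.random.default_rng(0)
n_genes, n_feat, n_dim = 6, 4, 3
adj = (rng.random((n_genes, n_genes)) < 0.4).astype(float)
np.fill_diagonal(adj, 0.0)
features = rng.normal(size=(n_genes, n_feat))
weight = rng.normal(size=(n_feat, n_dim))

# Two randomly perturbed views of the same graph, contrasted node by node.
view1 = embed(drop_edges(adj, 0.2, rng), features, weight)
view2 = embed(drop_edges(adj, 0.2, rng), features, weight)
loss = info_nce(view1, view2)
print(f"contrastive loss: {loss:.4f}")
```

In this sketch the two views come from arbitrary edge removal. The abstract's point is that such perturbations carry no biological meaning, which is what SupGCL replaces with perturbations derived from gene knockdown experiments.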
Problem

Research questions and friction points this paper is trying to address.

How to incorporate gene knockdown data into graph contrastive learning
How to improve biological network representations for downstream tasks
How to enhance performance in cancer patient hazard prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Incorporates gene knockdown data as supervision
Extends GCL with probabilistic biological models
Improves performance in biological downstream tasks
Authors
Sho Oshima
Graduate School of Medicine, Kyoto University
Yuji Okamoto
Graduate School of Medicine, Kyoto University
Taisei Tosaki
Graduate School of Medicine, Kyoto University
Ryosuke Kojima
Associate Professor, Kyoto University / Team leader, RIKEN BDR
Artificial Intelligence, Robot Audition, Probabilistic Programming
Yasushi Okuno
Graduate School of Medicine, Kyoto University