Beyond Independent Genes: Learning Module-Inductive Representations for Gene Perturbation Prediction

2026-02-03
Citations: 0
Influential: 0
AI Summary
This work addresses a critical limitation in current gene perturbation response prediction methods, which typically neglect the coordinated regulatory programs among functionally related genes. To overcome this, we propose scBIG, a novel framework that departs from the conventional paradigm of modeling genes independently. scBIG introduces a module induction mechanism and dynamic program modeling: it first identifies co-regulated gene modules through relational clustering, then employs a module-aware encoder to capture inter-module interactions. Furthermore, by integrating structure-aware alignment and conditional flow matching, scBIG explicitly characterizes the reorganization and coordination of gene programs under perturbation. Evaluated across multiple single-cell perturbation benchmarks, scBIG consistently outperforms existing methods, achieving an average improvement of 6.7%, particularly in unseen and combinatorial perturbation scenarios.
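As a rough illustration of what "identifying co-regulated gene modules" can mean, the sketch below groups genes whose expression profiles are strongly correlated. It is a stand-in, not scBIG's Gene-Relation Clustering: the Pearson-correlation criterion, the `threshold` parameter, the function names, and the toy gene labels are all assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / math.sqrt(vx * vy)


def induce_modules(
    expr: Dict[str, Sequence[float]], threshold: float = 0.9
) -> List[List[str]]:
    """Group genes into modules: connected components of the graph whose
    edges link gene pairs with correlation >= threshold (hypothetical rule)."""
    genes = sorted(expr)
    parent = {g: g for g in genes}

    def find(g: str) -> str:
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(expr[a], expr[b]) >= threshold:
                parent[find(a)] = find(b)

    modules: Dict[str, List[str]] = {}
    for g in genes:
        modules.setdefault(find(g), []).append(g)
    return sorted(modules.values())


# Toy data: geneA/geneB rise together across cells, geneC/geneD fall together.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.0, 2.0],
}
toy_modules = induce_modules(toy_expr)
```

On this toy input the two co-varying pairs end up in separate modules. A learned, data-driven relation (as the summary describes) would replace the fixed correlation threshold used here.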

๐Ÿ“ Abstract
Predicting transcriptional responses to genetic perturbations is a central problem in functional genomics. In practice, perturbation responses are rarely gene-independent but instead manifest as coordinated, program-level transcriptional changes among functionally related genes. However, most existing methods do not explicitly model such coordination, due to gene-wise modeling paradigms and reliance on static biological priors that cannot capture dynamic program reorganization. To address these limitations, we propose scBIG, a module-inductive perturbation prediction framework that explicitly models coordinated gene programs. scBIG induces coherent gene programs from data via Gene-Relation Clustering, captures inter-program interactions through a Gene-Cluster-Aware Encoder, and preserves modular coordination using structure-aware alignment objectives. These structured representations are then modeled using conditional flow matching to enable flexible and generalizable perturbation prediction. Extensive experiments on multiple single-cell perturbation benchmarks show that scBIG consistently outperforms state-of-the-art methods, particularly on unseen and combinatorial perturbation settings, achieving an average improvement of 6.7% over the strongest baselines.
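For readers unfamiliar with conditional flow matching, here is a minimal sketch of the generic objective under a straight-line probability path: a velocity model is trained to match `x1 - x0` at an interpolated point, and predictions are produced by integrating that velocity from the control state. The linear path, the Euler integrator, the toy "oracle" velocity, and all names are assumptions for illustration; the abstract does not specify scBIG's path, network, or conditioning.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# velocity(x_t, t, cond) -> predicted velocity; a hypothetical model interface.
VelocityFn = Callable[[Sequence[float], float, Sequence[float]], Vector]


def interpolate(x0: Sequence[float], x1: Sequence[float], t: float) -> Vector:
    """Point on the straight-line path x_t = (1 - t) * x0 + t * x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def cfm_loss(
    velocity: VelocityFn,
    x0: Sequence[float],
    x1: Sequence[float],
    cond: Sequence[float],
    t: float,
) -> float:
    """Squared error between the predicted velocity at x_t and the
    straight-line target velocity x1 - x0, averaged over dimensions."""
    xt = interpolate(x0, x1, t)
    target = [b - a for a, b in zip(x0, x1)]
    pred = velocity(xt, t, cond)
    return sum((p - u) ** 2 for p, u in zip(pred, target)) / len(target)


def euler_sample(
    velocity: VelocityFn,
    x0: Sequence[float],
    cond: Sequence[float],
    steps: int = 10,
) -> Vector:
    """Integrate dx/dt = velocity(x, t, cond) from t = 0 to t = 1."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity(x, k * dt, cond)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def oracle_velocity(x: Sequence[float], t: float, cond: Sequence[float]) -> Vector:
    """Toy stand-in for a trained network: the condition is the true shift."""
    return list(cond)


control = [1.0, 2.0]    # unperturbed representation (toy values)
shift = [0.5, -1.0]     # toy encoding of the perturbation effect
perturbed = [1.5, 1.0]  # control + shift

loss_at_half = cfm_loss(oracle_velocity, control, perturbed, shift, t=0.5)
predicted = euler_sample(oracle_velocity, control, shift)
```

In practice `t` is sampled uniformly per training example and the velocity function is a neural network conditioned on the perturbation; the oracle above only shows that a perfect velocity field drives the loss to zero and transports the control state to the perturbed one.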
Problem

Research questions and friction points this paper is trying to address.

gene perturbation prediction
transcriptional response
coordinated gene programs
functional genomics
dynamic program reorganization
Innovation

Methods, ideas, or system contributions that make the work stand out.

module-inductive
gene program coordination
Gene-Relation Clustering
conditional flow matching
structure-aware alignment
Authors

Jiafa Ruan
ReLER, CCAI, Zhejiang University
Ruijie Quan
Nanyang Technological University
Multimodal, Computer Vision, Sequence Modeling, AI4Science
Zongxin Yang
DBMI, HMS, Harvard University
Liyang Xu
ReLER, CCAI, Zhejiang University
Yi Yang
Zhejiang University
multimedia, computer vision, machine learning