Beyond Independent Genes: Learning Module-Inductive Representations for Gene Perturbation Prediction

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of current methods for predicting gene perturbation responses: they typically model genes independently and neglect the coordinated regulatory programs shared by functionally related genes. The paper proposes scBIG, a framework built around module induction and dynamic program modeling. It first identifies co-regulated gene modules through relational clustering (Gene-Relation Clustering), then uses a module-aware encoder (the Gene-Cluster-Aware Encoder) to capture interactions between modules. Structure-aware alignment objectives preserve modular coordination, and conditional flow matching models how gene programs reorganize under perturbation. Across multiple single-cell perturbation benchmarks, scBIG consistently outperforms existing methods, with an average improvement of 6.7% over the strongest baselines; the gains are most notable in unseen and combinatorial perturbation scenarios.
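To make the idea of inducing gene modules from data concrete, here is a minimal sketch. It is not scBIG's Gene-Relation Clustering, whose details the summary does not give. It substitutes a much simpler stand-in: genes whose expression profiles have a Pearson correlation above a threshold are linked, and each connected group is treated as a module. The function names (`pearson`, `induce_modules`), the threshold value, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as unrelated.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def induce_modules(expr, threshold=0.8):
    """Group genes into modules (hypothetical stand-in, not the paper's method).

    expr: dict mapping gene name -> list of expression values across cells.
    Genes with correlation >= threshold are merged with union-find.
    """
    genes = list(expr)
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(expr[a], expr[b]) >= threshold:
                parent[find(a)] = find(b)

    modules = {}
    for g in genes:
        modules.setdefault(find(g), []).append(g)
    return sorted(sorted(m) for m in modules.values())


# Toy data (made up): G1/G2 rise together, G3/G4 share a different pattern.
toy = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.1, 6.0, 8.2],
    "G3": [4.0, 1.0, 3.5, 1.2],
    "G4": [8.1, 2.0, 7.0, 2.5],
}
modules = induce_modules(toy)
print(modules)
```

On this toy input the two co-varying pairs end up in separate modules. A learned clustering such as the one the paper describes would presumably replace the fixed correlation threshold with relations inferred during training.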

📝 Abstract

Predicting transcriptional responses to genetic perturbations is a central problem in functional genomics. In practice, perturbation responses are rarely gene-independent but instead manifest as coordinated, program-level transcriptional changes among functionally related genes. However, most existing methods do not explicitly model such coordination, due to gene-wise modeling paradigms and reliance on static biological priors that cannot capture dynamic program reorganization. To address these limitations, we propose scBIG, a module-inductive perturbation prediction framework that explicitly models coordinated gene programs. scBIG induces coherent gene programs from data via Gene-Relation Clustering, captures inter-program interactions through a Gene-Cluster-Aware Encoder, and preserves modular coordination using structure-aware alignment objectives. These structured representations are then modeled using conditional flow matching to enable flexible and generalizable perturbation prediction. Extensive experiments on multiple single-cell perturbation benchmarks show that scBIG consistently outperforms state-of-the-art methods, particularly on unseen and combinatorial perturbation settings, achieving an average improvement of 6.7% over the strongest baselines.
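The abstract names conditional flow matching as the generative component but does not specify the formulation. The sketch below assumes one common variant: a straight-line path between a control state `x0` and a perturbed state `x1`, for which the regression target for the velocity field is simply `x1 - x0`. The function names, the toy vectors, and the use of an oracle velocity in place of a trained network are all assumptions. Conditioning on the perturbation identity, which a real model would feed into the velocity network, is omitted.

```python
def cfm_training_pair(x0, x1, t):
    """Return the point on the linear path at time t and its target velocity.

    Assumes the straight-line path x_t = (1 - t) * x0 + t * x1, whose time
    derivative is the constant vector x1 - x0.
    """
    xt = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
    v_target = [b - a for a, b in zip(x0, x1)]
    return xt, v_target


def euler_integrate(x0, velocity_fn, steps=10):
    """Generate a sample by integrating dx/dt = velocity_fn(x, t) from t=0 to 1."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy representations (made up) of a control cell and its perturbed counterpart.
control = [1.0, 0.5, 2.0]
perturbed = [0.2, 0.5, 3.0]

xt, v_target = cfm_training_pair(control, perturbed, t=0.5)

# In training, a network v_theta(xt, t, perturbation) would be regressed onto
# v_target. Here the exact target stands in for a perfectly trained network.
predicted = euler_integrate(control, lambda x, t: v_target, steps=10)
print(xt, v_target, predicted)
```

With the exact velocity, integrating from the control state recovers the perturbed state up to floating-point error. In practice the quality of the prediction depends on how well the learned velocity field generalizes to perturbations not seen during training.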

Problem

Research questions and friction points this paper is trying to address.

gene perturbation prediction

transcriptional response

coordinated gene programs

functional genomics

dynamic program reorganization

Innovation

Methods, ideas, or system contributions that make the work stand out.

module-inductive

gene program coordination

Gene-Relation Clustering