AI Summary
This work addresses a key limitation of current gene perturbation response prediction methods: they typically neglect the coordinated regulatory programs among functionally related genes. To overcome this, we propose scBIG, a framework that departs from the conventional paradigm of modeling genes independently. scBIG combines module induction with dynamic program modeling: it first identifies co-regulated gene modules through relational clustering, then employs a module-aware encoder to capture inter-module interactions. By integrating structure-aware alignment and conditional flow matching, scBIG explicitly characterizes how gene programs reorganize and coordinate under perturbation. Evaluated across multiple single-cell perturbation benchmarks, scBIG consistently outperforms existing methods, particularly in unseen and combinatorial perturbation scenarios, with an average improvement of 6.7% over the strongest baselines.
Abstract
Predicting transcriptional responses to genetic perturbations is a central problem in functional genomics. In practice, perturbation responses are rarely gene-independent but instead manifest as coordinated, program-level transcriptional changes among functionally related genes. However, most existing methods do not explicitly model such coordination, due to gene-wise modeling paradigms and reliance on static biological priors that cannot capture dynamic program reorganization. To address these limitations, we propose scBIG, a module-inductive perturbation prediction framework that explicitly models coordinated gene programs. scBIG induces coherent gene programs from data via Gene-Relation Clustering, captures inter-program interactions through a Gene-Cluster-Aware Encoder, and preserves modular coordination using structure-aware alignment objectives. These structured representations are then modeled using conditional flow matching to enable flexible and generalizable perturbation prediction. Extensive experiments on multiple single-cell perturbation benchmarks show that scBIG consistently outperforms state-of-the-art methods, particularly on unseen and combinatorial perturbation settings, achieving an average improvement of 6.7% over the strongest baselines.
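To make the conditional flow matching step more concrete, the sketch below shows one common form of the training objective. It is illustrative only and is not scBIG's implementation: the abstract does not specify the probability path, so a straight-line path between control and perturbed representations is assumed here, and every name (`cfm_batch`, `cfm_loss`, `velocity_fn`, `cond`) as well as the synthetic data is hypothetical.

```python
import numpy as np

def cfm_batch(x0, x1, rng):
    """Sample points on straight-line paths and their target velocities."""
    t = rng.uniform(size=(x0.shape[0], 1))   # one time in [0, 1) per cell pair
    x_t = (1.0 - t) * x0 + t * x1            # assumed linear interpolation path
    u_t = x1 - x0                            # target velocity, constant along the path
    return t, x_t, u_t

def cfm_loss(velocity_fn, x0, x1, cond, rng):
    """Mean squared error between predicted and target velocities."""
    t, x_t, u_t = cfm_batch(x0, x1, rng)
    v_pred = velocity_fn(x_t, t, cond)       # model sees state, time, and condition
    return float(np.mean(np.sum((v_pred - u_t) ** 2, axis=1)))

# Synthetic stand-ins (assumptions): control representations x0 and
# perturbed representations x1 that differ by a fixed shift.
rng = np.random.default_rng(0)
n_cells, n_dims = 64, 8
x0 = rng.normal(size=(n_cells, n_dims))
shift = rng.normal(size=(1, n_dims))
cond = np.ones((n_cells, 1))                 # 1 = perturbation applied
x1 = x0 + cond * shift

# Two hypothetical velocity fields: one that knows the true shift,
# and one that predicts no change at all.
oracle = lambda x_t, t, c: c * shift
zero = lambda x_t, t, c: np.zeros_like(x_t)

loss_oracle = cfm_loss(oracle, x0, x1, cond, rng)
loss_zero = cfm_loss(zero, x0, x1, cond, rng)
print(loss_oracle, loss_zero)
```

In this toy setting the oracle field attains a loss of essentially zero, while the zero field pays the squared norm of the shift. In a real model, `velocity_fn` would be a trained network conditioned on the perturbation, and predictions would be obtained by integrating the learned velocity field from a control state.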