In-silico biological discovery with large perturbation models

📅 2025-03-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses three challenges: integrating multi-source heterogeneous perturbation data, weak mechanistic generalization, and limited knowledge transfer across perturbation types. It proposes the Large Perturbation Model (LPM), presented as the first framework to introduce a three-dimensional disentangled representation (“perturbation–readout–context”), which enables unified modeling of chemical and genetic perturbations and zero-shot response prediction. LPM integrates multimodal deep embedding, contrastive learning, and conditional generative modeling to achieve accurate transcriptomic prediction, cross-experiment mechanistic pattern discovery, and gene interaction inference. On benchmark tasks, LPM significantly outperforms state-of-the-art methods: it improves R² by 18% for transcriptomic prediction on unseen experiments, increases chemical–genetic mechanism matching accuracy by 23%, and achieves an AUPRC of 0.41 for gene regulatory network inference.

📝 Abstract

Data generated in perturbation experiments link perturbations to the changes they elicit and therefore contain information relevant to numerous biological discovery tasks -- from understanding the relationships between biological entities to developing therapeutics. However, these data encompass diverse perturbations and readouts, and the complex dependence of experimental outcomes on their biological context makes it challenging to integrate insights across experiments. Here, we present the Large Perturbation Model (LPM), a deep-learning model that integrates multiple, heterogeneous perturbation experiments by representing perturbation, readout, and context as disentangled dimensions. LPM outperforms existing methods across multiple biological discovery tasks, including in predicting post-perturbation transcriptomes of unseen experiments, identifying shared molecular mechanisms of action between chemical and genetic perturbations, and facilitating the inference of gene-gene interaction networks.

Problem

Research questions and friction points this paper is trying to address.

Integrating diverse perturbation data for biological discovery

Predicting post-perturbation outcomes in unseen experiments

Identifying shared molecular mechanisms across perturbations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep-learning model integrates heterogeneous perturbation experiments

Disentangles perturbation, readout, and context dimensions

Outperforms existing methods in biological discovery tasks
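
As a rough illustration of what representing perturbation, readout, and context as separate dimensions could look like in code, the toy sketch below gives each dimension its own embedding table and predicts an outcome from the concatenated embeddings. It is not the LPM architecture: the class name `DisentangledPRCModel`, the linear head, the symbols (`drugA`, `koB`, `gene1`, `cellX`, and so on), and the target values are all hypothetical and chosen only for illustration.

```python
import random


class DisentangledPRCModel:
    """Toy model: one embedding table per dimension, combined by a linear head."""

    def __init__(self, perturbations, readouts, contexts, dim=4, seed=0):
        rng = random.Random(seed)

        def table(symbols):
            # One small random vector per symbol.
            return {s: [rng.gauss(0.0, 0.1) for _ in range(dim)] for s in symbols}

        self.dim = dim
        self.emb_p = table(perturbations)  # perturbation embeddings
        self.emb_r = table(readouts)       # readout embeddings
        self.emb_c = table(contexts)       # context embeddings
        self.weights = [rng.gauss(0.0, 0.1) for _ in range(3 * dim)]
        self.bias = 0.0

    def features(self, p, r, c):
        # List concatenation builds a new list; the tables are not modified.
        return self.emb_p[p] + self.emb_r[r] + self.emb_c[c]

    def predict(self, p, r, c):
        x = self.features(p, r, c)
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def sgd_step(self, p, r, c, y, lr=0.05):
        """One gradient step on 0.5 * (prediction - y)^2; returns the loss before the step."""
        x = self.features(p, r, c)
        err = self.predict(p, r, c) - y
        grad_x = [err * w for w in self.weights]  # computed before weights change
        self.weights = [w - lr * err * xi for w, xi in zip(self.weights, x)]
        self.bias -= lr * err
        # Update the three embedding vectors in place.
        for k, vec in enumerate((self.emb_p[p], self.emb_r[r], self.emb_c[c])):
            for i in range(self.dim):
                vec[i] -= lr * grad_x[k * self.dim + i]
        return 0.5 * err * err


def mean_loss(model, data):
    return sum(0.5 * (model.predict(p, r, c) - y) ** 2 for p, r, c, y in data) / len(data)


# Hypothetical (perturbation, readout, context, measured value) records.
train = [
    ("drugA", "gene1", "cellX", 1.0),
    ("drugA", "gene2", "cellX", -1.0),
    ("koB", "gene1", "cellY", 0.5),
]

model = DisentangledPRCModel(
    perturbations=["drugA", "koB"],
    readouts=["gene1", "gene2"],
    contexts=["cellX", "cellY"],
)

initial_loss = mean_loss(model, train)
for _ in range(500):
    for p, r, c, y in train:
        model.sgd_step(p, r, c, y)
final_loss = mean_loss(model, train)

# A combination never seen during training can still be queried,
# because each of its symbols has its own learned embedding.
held_out = model.predict("koB", "gene2", "cellX")
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}; held-out prediction: {held_out:.3f}")
```

Because the three dimensions are embedded independently, a chemical perturbation (`drugA`) and a genetic one (`koB`) live in the same table and can be compared directly. A symbol that never appears in training has no embedding, so querying it raises a `KeyError` in this sketch.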