In-silico biological discovery with large perturbation models

📅 2025-03-30
🤖 AI Summary
This study addresses three challenges in learning from perturbation experiments: integrating heterogeneous perturbation data from multiple sources, weak mechanistic generalization, and limited knowledge transfer across perturbation types. It proposes the Large Perturbation Model (LPM), the first framework to introduce a three-dimensional disentangled representation (“perturbation–readout–context”), which enables unified modeling of chemical and genetic perturbations and zero-shot response prediction. LPM integrates multimodal deep embedding, contrastive learning, and conditional generative modeling to achieve accurate transcriptomic prediction, cross-experiment mechanistic pattern discovery, and gene interaction inference. On benchmark tasks, LPM significantly outperforms state-of-the-art methods: it improves R² by 18% for transcriptomic prediction in unseen experiments, increases chemical–genetic mechanism matching accuracy by 23%, and achieves an AUPRC of 0.41 for gene regulatory network inference.

📝 Abstract
Data generated in perturbation experiments link perturbations to the changes they elicit and therefore contain information relevant to numerous biological discovery tasks -- from understanding the relationships between biological entities to developing therapeutics. However, these data encompass diverse perturbations and readouts, and the complex dependence of experimental outcomes on their biological context makes it challenging to integrate insights across experiments. Here, we present the Large Perturbation Model (LPM), a deep-learning model that integrates multiple, heterogeneous perturbation experiments by representing perturbation, readout, and context as disentangled dimensions. LPM outperforms existing methods across multiple biological discovery tasks, including in predicting post-perturbation transcriptomes of unseen experiments, identifying shared molecular mechanisms of action between chemical and genetic perturbations, and facilitating the inference of gene-gene interaction networks.
Problem

Research questions and friction points this paper is trying to address.

Integrating diverse perturbation data for biological discovery
Predicting post-perturbation outcomes in unseen experiments
Identifying shared molecular mechanisms across perturbations
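
Predicting outcomes for unseen experiments implies an evaluation split in which the tested combinations never occur in training. A minimal Python sketch of such a split follows, assuming each measurement is stored as a (perturbation, readout, context, value) record. The `Record` type, both function names, and the toy labels and values are hypothetical and are not taken from the paper.

```python
from collections import namedtuple

# Hypothetical record layout: one measured value per
# (perturbation, readout, context) triple.
Record = namedtuple("Record", ["perturbation", "readout", "context", "value"])


def split_unseen_experiments(records, held_out):
    """Send every record whose (perturbation, context) pair is in
    `held_out` to the test set and all other records to the training set."""
    train, test = [], []
    for rec in records:
        pair = (rec.perturbation, rec.context)
        (test if pair in held_out else train).append(rec)
    return train, test


def components_seen_in_training(train, held_out):
    """Check that each held-out perturbation and each held-out context
    still occurs somewhere in training, so only the combination is new."""
    seen_p = {rec.perturbation for rec in train}
    seen_c = {rec.context for rec in train}
    return all(p in seen_p and c in seen_c for p, c in held_out)


# Toy data with placeholder labels and made-up values.
records = [
    Record("pert_A", "gene_1", "ctx_X", 0.5),
    Record("pert_A", "gene_1", "ctx_Y", 0.1),
    Record("pert_B", "gene_1", "ctx_X", -0.3),
    Record("pert_B", "gene_1", "ctx_Y", 0.8),
]
held_out = {("pert_B", "ctx_Y")}
train, test = split_unseen_experiments(records, held_out)
feasible = components_seen_in_training(train, held_out)
```

Holding out whole (perturbation, context) pairs, rather than random rows, keeps every readout of the held-out experiment out of training, which is what makes the test experiment unseen in this sketch.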
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep-learning model integrates heterogeneous perturbation experiments
Disentangles perturbation, readout, and context dimensions
Outperforms existing methods in biological discovery tasks
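
One way to picture the disentangled perturbation, readout, and context dimensions is as three separate embedding tables whose vectors are concatenated and passed to a shared prediction head. The sketch below is only an illustration under that assumption: the class name `PRCModel`, the layer sizes, the single tanh hidden layer, and the placeholder labels are all hypothetical, the weights are random and untrained, and nothing here reproduces the authors' actual architecture.

```python
import math
import random


class PRCModel:
    """Toy model with one embedding table per dimension (assumed design)."""

    def __init__(self, perturbations, readouts, contexts,
                 dim=4, hidden=8, seed=0):
        rng = random.Random(seed)

        def table(symbols):
            # One small random vector per symbol.
            return {s: [rng.gauss(0.0, 0.1) for _ in range(dim)]
                    for s in symbols}

        # Separate tables keep the three dimensions disentangled.
        self.p_emb = table(perturbations)
        self.r_emb = table(readouts)
        self.c_emb = table(contexts)
        # Shared head: one tanh hidden layer, then a scalar output.
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(3 * dim)]
                   for _ in range(hidden)]
        self.w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden)]

    def encode(self, p, r, c):
        # List concatenation: [perturbation | readout | context].
        return self.p_emb[p] + self.r_emb[r] + self.c_emb[c]

    def predict(self, p, r, c):
        x = self.encode(p, r, c)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
             for row in self.w1]
        return sum(v * hi for v, hi in zip(self.w2, h))


# Placeholder labels; a chemical and a genetic perturbation share one table.
model = PRCModel(
    perturbations=["compound_X", "knockout_gene_1"],
    readouts=["gene_1", "gene_2"],
    contexts=["ctx_X", "ctx_Y"],
)
y = model.predict("compound_X", "gene_2", "ctx_Y")
```

Because each symbol is looked up independently, the model can be queried for any (perturbation, readout, context) triple whose three parts were each seen during training, even if that exact combination was not. Placing chemical and genetic perturbations in the same table is also what would let their learned vectors be compared directly.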
Authors

Djordje Miladinovic
Meta, Reality Labs
AI, machine learning, computer vision, drug discovery, healthcare

Tobias Hoppe
GSK plc, Zug, Switzerland

Mathieu Chevalley
GSK plc, Zug, Switzerland

Andreas Georgiou
GSK plc, Zug, Switzerland

Lachlan Stuart
GSK plc, Zug, Switzerland

Arash Mehrjou
ETH Zürich - Max Planck Institute - GSK.ai
Machine Learning, Control Theory, Causality

Marcus Bantscheff
GSK plc, Zug, Switzerland

Bernhard Schölkopf
Max Planck Institute for Intelligent Systems & ELLIS Institute, Tübingen, Germany

Patrick Schwab
GSK
Causal Machine Learning, AI in Drug Discovery, AI in Healthcare, AI in Medicine