VCWorld: A Biological World Model for Virtual Cell Simulation

📅 2025-11-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current virtual cell models rely heavily on large-scale single-cell data, limiting their generalizability due to data quality issues and batch effects; moreover, they are predominantly black-box systems, lacking interpretability and biological consistency. To address these limitations, we propose the first biologically grounded world model framework for cellular response prediction. Our approach integrates knowledge graphs, causal chain reasoning by large language models, and dynamic simulation of signaling pathways to construct an interpretable, data-efficient white-box cellular simulator. The model enables stepwise mechanistic inference and hypothesis generation under molecular perturbations. In drug perturbation prediction tasks, it achieves state-of-the-art performance, and the inferred signaling pathways are strongly concordant with established biological evidence. This advances the scientific credibility and mechanistic interpretability of virtual cell models, enabling rigorous, hypothesis-driven discovery in systems pharmacology and cell biology.

📝 Abstract
Virtual cell modeling aims to predict cellular responses to perturbations. Existing virtual cell models rely heavily on large-scale single-cell datasets, learning explicit mappings between gene expression and perturbations. Although recent models attempt to incorporate multi-source biological information, their generalization remains constrained by data quality, coverage, and batch effects. More critically, these models often function as black boxes, offering predictions without interpretability or consistency with biological principles, which undermines their credibility in scientific research. To address these challenges, we present VCWorld, a cell-level white-box simulator that integrates structured biological knowledge with the iterative reasoning capabilities of large language models to instantiate a biological world model. VCWorld operates in a data-efficient manner to reproduce perturbation-induced signaling cascades and generates interpretable, stepwise predictions alongside explicit mechanistic hypotheses. In drug perturbation benchmarks, VCWorld achieves state-of-the-art predictive performance, and the inferred mechanistic pathways are consistent with publicly available biological evidence.
Problem

Research questions and friction points this paper is trying to address.

Develops a white-box simulator for virtual cell modeling
Integrates biological knowledge with large language models
Generates interpretable predictions and mechanistic hypotheses
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates structured biological knowledge with large language models
Operates data-efficiently to reproduce signaling cascades
Generates interpretable stepwise predictions with mechanistic hypotheses
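As a rough illustration of what "interpretable stepwise predictions" over structured knowledge could look like, the toy sketch below propagates a perturbation through a small signed graph and records each inference step as a readable trace. This is not VCWorld's implementation: the edge list, node names, and the `propagate` function are all hypothetical, and the language-model reasoning described in the abstract is omitted entirely.

```python
from collections import deque

# Hypothetical signed edges: (source, target, sign); +1 = activates, -1 = inhibits.
# Node names are placeholders, not real genes or curated interactions.
EDGES = [
    ("receptor", "kinase_1", +1),
    ("kinase_1", "kinase_2", +1),
    ("kinase_2", "repressor", -1),
    ("repressor", "gene_x", -1),
]


def propagate(edges, target, effect):
    """Breadth-first propagation of a perturbation sign through a signed graph.

    Returns (states, trace): the predicted direction for each reached node
    (+1 up, -1 down) and a human-readable list of the steps taken.
    """
    adjacency = {}
    for src, dst, sign in edges:
        adjacency.setdefault(src, []).append((dst, sign))

    states = {target: effect}
    trace = []
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for dst, sign in adjacency.get(node, []):
            if dst in states:  # keep the first (shortest-path) assignment
                continue
            states[dst] = states[node] * sign
            relation = "activates" if sign > 0 else "inhibits"
            direction = "up" if states[dst] > 0 else "down"
            trace.append(f"{node} {relation} {dst} -> {dst} goes {direction}")
            queue.append(dst)
    return states, trace


# Simulate a drug that inhibits the hypothetical receptor (effect = -1).
states, trace = propagate(EDGES, "receptor", -1)
for step in trace:
    print(step)
```

The trace is the point of the design: every predicted change is tied to an explicit edge, so a wrong prediction can be traced back to a specific assumed interaction rather than to an opaque learned mapping.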
Zhijian Wei
Shanghai Jiao Tong University
Runze Ma
Shanghai Jiao Tong University
Zichen Wang
Shanghai Jiao Tong University
Zhongmin Li
Shanghai Jiao Tong University
Shuotong Song
Shanghai Jiao Tong University
Shuangjia Zheng
Shanghai Jiao Tong University
Generative AI, Drug Discovery, Synthetic Biology, Multi-Agent System