🤖 AI Summary
Existing AI-based virtual cell (AIVC) methods predominantly model direct associations between perturbations and downstream RNA expression or morphological changes, neglecting the critical causal pathway RNA → morphology.
Method: We propose TRIDENT, the first framework to explicitly model RNA expression as a causal intermediate representation linking drug perturbations to cellular morphology, enabling high-fidelity morphology generation conditioned jointly on drugs and RNA. To support this, we introduce MorphoGene—the first paired dataset integrating L1000 transcriptomic profiles with Cell Painting morphological images—and design a trimodal cascaded generative model.
Results: TRIDENT significantly outperforms baselines across multiple compounds (up to 7× improvement in fidelity), exhibits strong generalization to unseen compounds, and accurately recapitulates expected phenotypes for docetaxel—validating both the biological plausibility of the RNA-mediated mechanism and the effectiveness of our causal modeling approach.
📝 Abstract
Accurately modeling the relationship between perturbations, transcriptional responses, and phenotypic changes is essential for building an AI Virtual Cell (AIVC). However, existing methods are typically constrained to modeling direct associations, such as Perturbation $\rightarrow$ RNA or Perturbation $\rightarrow$ Morphology, and overlook the crucial causal link from RNA to morphology. To bridge this gap, we propose TRIDENT, a cascade generative framework that synthesizes realistic cellular morphology by conditioning on both the perturbation and the corresponding gene expression profile. To train and evaluate this task, we construct MorphoGene, a new dataset pairing L1000 gene expression with Cell Painting images for 98 compounds. TRIDENT significantly outperforms state-of-the-art approaches, achieving up to a 7-fold improvement with strong generalization to unseen compounds. In a case study on docetaxel, we validate that RNA-guided synthesis accurately produces the corresponding phenotype. An ablation study further confirms that this RNA conditioning is essential for the model's high fidelity. By explicitly modeling the transcriptome-to-phenome mapping, TRIDENT provides a powerful in silico tool and moves us closer to a predictive virtual cell.
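
To make the cascade idea concrete, the following toy sketch shows the data flow Perturbation $\rightarrow$ RNA $\rightarrow$ Morphology, with the final stage conditioned on both the perturbation and the RNA profile. It is not the TRIDENT architecture, whose details are not given here. The random linear maps stand in for learned generative models, and the names `predict_rna`, `generate_morphology`, and `cascade`, together with the sizes `DRUG_DIM` and `MORPH_DIM`, are hypothetical. The only grounded constant is the 978 landmark genes measured by L1000.

```python
import math
import random

N_GENES = 978    # L1000 measures 978 landmark genes
DRUG_DIM = 8     # hypothetical drug-embedding size
MORPH_DIM = 16   # hypothetical morphology feature size (stand-in for an image)

rng = random.Random(0)


def random_matrix(rows, cols, scale=0.1):
    """Random weights; a stand-in for parameters a real model would learn."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def project(vec, mat):
    """Multiply a row vector (len rows) by a rows x cols matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


W_RNA = random_matrix(DRUG_DIM, N_GENES)
W_MORPH = random_matrix(DRUG_DIM + N_GENES, MORPH_DIM)


def predict_rna(drug):
    """Stage 1 (assumed): perturbation embedding -> expression profile."""
    return project(drug, W_RNA)


def generate_morphology(drug, rna):
    """Stage 2: condition jointly on the perturbation and the RNA profile."""
    condition = list(drug) + list(rna)
    return [math.tanh(x) for x in project(condition, W_MORPH)]


def cascade(drug):
    """Run both stages, passing RNA through as the intermediate representation."""
    rna = predict_rna(drug)
    return rna, generate_morphology(drug, rna)


drug = [rng.gauss(0.0, 1.0) for _ in range(DRUG_DIM)]
rna, morph = cascade(drug)

# Crude analogue of an RNA-conditioning ablation: blank out the RNA input.
no_rna = generate_morphology(drug, [0.0] * N_GENES)

print(len(rna), len(morph))  # 978 16
```

Comparing `morph` with `no_rna` only shows that the second stage actually consumes the RNA signal in this toy setup. It says nothing about fidelity, which in the paper is measured on real Cell Painting images.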