Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines

📅 2025-07-14
🤖 AI Summary
Microscopy image profiling models generalize poorly to de novo cell lines in high-throughput perturbation screening, mainly because morphological and biological heterogeneity across cell lines limits model transferability. To address this, the paper proposes a knowledge-guided disentangled representation learning framework: (1) a knowledge graph built from protein interaction data (STRING and Hetionet) guides pretraining toward perturbation-specific representations; (2) transcriptomic features from single-cell foundation models capture cell-line-specific representations, with orthogonality between the two factors enforced via disentanglement. Evaluated on RxRx data, with one-shot fine-tuning on an RxRx1 cell line and few-shot fine-tuning on cell lines from RxRx19a, the method improves the robustness of image representations for unseen cell lines. The core idea is to embed structured biological priors, namely a domain-specific knowledge graph and single-cell transcriptomic features, into deep learning models so that perturbation effects are disentangled from cellular context, supporting phenotypic analysis in real-world drug screening.
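The summary does not say how orthogonality between the two factors is enforced. One common choice, shown here purely as an assumption and not as the paper's loss, is to penalize the cross-covariance between the perturbation embeddings and the cell-line embeddings of a batch. The function names `center` and `orthogonality_penalty` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # one row per image, one column per embedding dimension


def center(rows: Matrix) -> Matrix:
    """Subtract the per-column mean so covariances are measured around zero."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    return [[r[j] - means[j] for j in range(dim)] for r in rows]


def orthogonality_penalty(pert: Matrix, cell: Matrix) -> float:
    """Sum of squared cross-covariances between the two embedding sets.

    Assumed illustration only: the penalty is zero when no perturbation
    dimension co-varies linearly with any cell-line dimension.
    """
    if not pert or len(pert) != len(cell):
        raise ValueError("both embedding sets need the same non-zero batch size")
    p, c = center(pert), center(cell)
    n = len(p)
    penalty = 0.0
    for i in range(len(p[0])):
        for j in range(len(c[0])):
            cov = sum(p[k][i] * c[k][j] for k in range(n)) / n
            penalty += cov * cov
    return penalty


# Toy batch of four images with 2-d embeddings (made-up numbers).
pert_emb = [[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
cell_emb = [[0.0, 1.0], [0.0, 1.0], [0.0, -1.0], [0.0, -1.0]]
independent = orthogonality_penalty(pert_emb, cell_emb)  # factors do not co-vary
entangled = orthogonality_penalty(pert_emb, pert_emb)    # identical factors
```

In a training loop, a term like this would be added to the main objective with a weighting coefficient, so that the perturbation branch is discouraged from encoding cell-line identity.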

📝 Abstract
High-throughput screening techniques, such as microscopy imaging of cellular responses to genetic and chemical perturbations, play a crucial role in drug discovery and biomedical research. However, robust perturbation screening for de novo cell lines remains challenging due to the significant morphological and biological heterogeneity across cell lines. To address this, we propose a novel framework that integrates external biological knowledge into existing pretraining strategies to enhance microscopy image profiling models. Our approach explicitly disentangles perturbation-specific and cell line-specific representations using external biological information. Specifically, we construct a knowledge graph leveraging protein interaction data from STRING and Hetionet databases to guide models toward perturbation-specific features during pretraining. Additionally, we incorporate transcriptomic features from single-cell foundation models to capture cell line-specific representations. By learning these disentangled features, our method improves the generalization of imaging models to de novo cell lines. We evaluate our framework on the RxRx database through one-shot fine-tuning on an RxRx1 cell line and few-shot fine-tuning on cell lines from the RxRx19a dataset. Experimental results demonstrate that our method enhances microscopy image profiling for de novo cell lines, highlighting its effectiveness in real-world phenotype-based drug discovery applications.
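The abstract does not detail how the knowledge graph is assembled. The sketch below shows one plausible way to merge interaction edges from two sources into a single undirected graph and to look up genes within a few hops of a perturbed gene. Everything in it is an assumption for illustration: the gene names are placeholders, the edge lists are toy data rather than real STRING or Hetionet records, and the `min_score` threshold and the functions `build_graph` and `neighbors_within` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str, float]  # (gene_a, gene_b, confidence in [0, 1])


def build_graph(sources: Dict[str, Iterable[Edge]],
                min_score: float = 0.7) -> Dict[str, Set[str]]:
    """Merge edge lists into one undirected adjacency map.

    Self-loops and edges below `min_score` (an assumed cutoff) are dropped.
    """
    graph: Dict[str, Set[str]] = defaultdict(set)
    for edges in sources.values():
        for a, b, score in edges:
            if a == b or score < min_score:
                continue
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def neighbors_within(graph: Dict[str, Set[str]], start: str, hops: int) -> Set[str]:
    """Breadth-first search: all genes reachable from `start` in at most `hops` steps."""
    seen = {start}
    frontier = {start}
    for _ in range(hops):
        frontier = {n for node in frontier for n in graph.get(node, set())} - seen
        seen |= frontier
    return seen - {start}


# Toy edge lists standing in for the two databases (placeholder genes and scores).
toy_sources = {
    "string_like": [("GENE_A", "GENE_B", 0.9),
                    ("GENE_B", "GENE_C", 0.4),   # below cutoff, dropped
                    ("GENE_C", "GENE_D", 0.8)],
    "hetionet_like": [("GENE_A", "GENE_C", 1.0),
                      ("GENE_D", "GENE_D", 1.0)],  # self-loop, dropped
}
graph = build_graph(toy_sources)
related = neighbors_within(graph, "GENE_B", hops=2)
```

A graph like this could, for example, supply related-perturbation pairs for a pretraining objective: perturbations whose target genes sit close together in the graph would be encouraged to have similar image representations. That usage is a guess at the general idea, not a description of the authors' training procedure.
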
Problem

Research questions and friction points this paper is trying to address.

Enhance microscopy image profiling for de novo cell lines
Disentangle perturbation and cell line-specific features
Improve generalization using biological knowledge graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates biological knowledge for image profiling
Disentangles perturbation and cell line features
Uses knowledge graphs and transcriptomic data
Authors

Jiayuan Chen
The Ohio State University
Thai-Hoang Pham
The Ohio State University
Trustworthy AI, Natural Language Processing, Machine Learning, Bioinformatics, Health Informatics
Yuanlong Wang
The Ohio State University
Ping Zhang
The Ohio State University