Compressing Biology: Evaluating the Stable Diffusion VAE for Phenotypic Drug Discovery

📅 2025-10-22
🤖 AI Summary
Problem: High-throughput phenotypic screening with Cell Painting produces very high-dimensional microscopy images, and general-purpose generative models trained on natural images are increasingly applied to such data without quantitative evidence that they preserve biological signal. Method: The authors systematically assess Stable Diffusion's variational autoencoder (SD-VAE) for reconstructing Cell Painting images across a large dataset spanning diverse molecular perturbations and cell types. Reconstruction quality is benchmarked with pixel-level, embedding-based, latent-space, and retrieval-based metrics, and general-purpose feature extractors such as InceptionV3 are compared against publicly available bespoke models. Contribution/Results: SD-VAE reconstructions preserve phenotypic signals with minimal loss, and general-purpose feature extractors match or surpass the bespoke models in retrieval tasks. The work offers practical guidelines for evaluating generative models on microscopy data and supports the use of off-the-shelf models in phenotypic drug discovery.

📝 Abstract
High-throughput phenotypic screens generate vast microscopy image datasets that push the limits of generative models due to their large dimensionality. Despite the growing popularity of general-purpose models trained on natural images for microscopy data analysis, their suitability in this domain has not been quantitatively demonstrated. We present the first systematic evaluation of Stable Diffusion's variational autoencoder (SD-VAE) for reconstructing Cell Painting images, assessing performance across a large dataset with diverse molecular perturbations and cell types. We find that SD-VAE reconstructions preserve phenotypic signals with minimal loss, supporting its use in microscopy workflows. To benchmark reconstruction quality, we compare pixel-level, embedding-based, latent-space, and retrieval-based metrics for a biologically informed evaluation. We show that general-purpose feature extractors like InceptionV3 match or surpass publicly available bespoke models in retrieval tasks, simplifying future pipelines. Our findings offer practical guidelines for evaluating generative models on microscopy data and support the use of off-the-shelf models in phenotypic drug discovery.
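
As an illustration of what a pixel-level metric can look like, the sketch below computes the mean squared error and peak signal-to-noise ratio (PSNR) between an image and its reconstruction. The abstract does not say which pixel-level metrics the authors used, so PSNR is an assumption chosen because it is a common choice, and the function names and the toy 2x2 image are hypothetical.

```python
import math


def mse(original, reconstruction):
    """Mean squared error between two equally sized, flattened pixel sequences."""
    if len(original) != len(reconstruction):
        raise ValueError("images must have the same number of pixels")
    squared_errors = ((a - b) ** 2 for a, b in zip(original, reconstruction))
    return sum(squared_errors) / len(original)


def psnr(original, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(original, reconstruction)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Hypothetical toy data: a flattened 2x2 single-channel image with
# intensities in [0, 1], and a slightly perturbed "reconstruction".
image = [0.0, 0.5, 0.5, 1.0]
recon = [0.1, 0.5, 0.4, 1.0]
print(round(psnr(image, recon), 2))  # about 23.01 dB
```

Higher PSNR means the reconstruction is closer to the original in raw intensity, but it says nothing about whether the phenotype is preserved, which is why the abstract pairs pixel-level metrics with embedding-based and retrieval-based ones.
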
Problem

Research questions and friction points this paper is trying to address.

Evaluating Stable Diffusion VAE for Cell Painting image reconstruction
Assessing general-purpose models' suitability for microscopy data analysis
Benchmarking reconstruction quality with biologically informed metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluating Stable Diffusion VAE for microscopy reconstruction
Using general-purpose feature extractors for biological retrieval
Applying off-the-shelf models in phenotypic drug discovery
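
A retrieval-based evaluation can be sketched as a nearest-neighbour lookup in embedding space: each image embedding is used as a query, and the retrieval counts as correct when its closest neighbour shares the same perturbation label. This is only a minimal sketch under assumptions. The source does not specify the retrieval protocol, the similarity measure, or the accuracy definition, so cosine similarity and top-1 accuracy are stand-ins, and the embeddings and labels below are made-up toy values rather than InceptionV3 features.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top1_retrieval_accuracy(embeddings, labels):
    """Fraction of items whose nearest other item shares their label."""
    hits = 0
    for i, query in enumerate(embeddings):
        # Search every item except the query itself.
        best_j = max(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: cosine_similarity(query, embeddings[j]),
        )
        hits += labels[best_j] == labels[i]
    return hits / len(embeddings)


# Hypothetical toy embeddings: two perturbations, two images each.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = ["compound_A", "compound_A", "compound_B", "compound_B"]
accuracy = top1_retrieval_accuracy(embeddings, labels)
print(accuracy)
```

Running the same function on embeddings of original images and of their reconstructions gives a direct way to check how much retrieval performance is lost through compression.
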
Télio Cropsal
Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Gothenburg, SE
Rocío Mercado
Chalmers University of Technology
molecular engineering, machine learning, deep generative models, drug discovery, materials discovery