Scalable Single-Cell Gene Expression Generation with Latent Diffusion Models

πŸ“… 2025-11-04
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Single-cell gene expression generation faces dual challenges: modeling count-based data characteristics and capturing complex inter-gene dependencies. Existing methods often rely on arbitrary gene-ordering assumptions or shallow network architectures, leading to distorted expression profiles. To address this, we propose scLDM, the first permutation-invariant latent diffusion framework for single-cell transcriptomics. scLDM employs a unified multi-head cross-attention block to achieve permutation-invariant pooling in the encoder and permutation-equivariant unpooling in the decoder. It further integrates a latent-space Diffusion Transformer, linear interpolants for efficient trajectory modeling, and multi-condition classifier-free guidance to capture high-dimensional gene dependencies. Evaluated on both observational and perturbation datasets, scLDM achieves superior generative fidelity and state-of-the-art performance in downstream tasks such as cell-type classification. These results demonstrate its scalability, biological plausibility, and robustness across diverse experimental settings.
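The multi-condition classifier-free guidance mentioned above is commonly implemented by combining one unconditional model output with several conditional ones, each scaled by its own guidance weight. A minimal sketch of that combination rule follows; the function name `multi_condition_cfg` and the specific linear form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def multi_condition_cfg(v_uncond, v_conds, weights):
    """Combine an unconditional estimate (velocity or score) with several
    conditional ones: v = v_u + sum_i w_i * (v_c[i] - v_u).
    weights[i] > 1 strengthens condition i; 0 disables it."""
    v_u = np.asarray(v_uncond, dtype=float)
    v = v_u.copy()
    for v_c, w in zip(v_conds, weights):
        v += w * (np.asarray(v_c, dtype=float) - v_u)
    return v

# Two conditions (e.g. cell type and perturbation), each with its own weight.
guided = multi_condition_cfg(
    v_uncond=[0.0, 0.0],
    v_conds=[[1.0, 0.0], [0.0, 1.0]],
    weights=[2.0, 3.0],
)
```

Setting all weights to zero recovers the unconditional model, so a single network trained with condition dropout can serve both roles at sampling time.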

πŸ“ Abstract
Computational modeling of single-cell gene expression is crucial for understanding cellular processes, but generating realistic expression profiles remains a major challenge. This difficulty arises from the count nature of gene expression data and complex latent dependencies among genes. Existing generative models often impose artificial gene orderings or rely on shallow neural network architectures. We introduce a scalable latent diffusion model for single-cell gene expression data, which we refer to as scLDM, that respects the fundamental exchangeability property of the data. Our VAE uses fixed-size latent variables leveraging a unified Multi-head Cross-Attention Block (MCAB) architecture, which serves dual roles: permutation-invariant pooling in the encoder and permutation-equivariant unpooling in the decoder. We enhance this framework by replacing the Gaussian prior with a latent diffusion model using Diffusion Transformers and linear interpolants, enabling high-quality generation with multi-conditional classifier-free guidance. We show its superior performance in a variety of experiments for both observational and perturbational single-cell data, as well as downstream tasks like cell-level classification.
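The "linear interpolants" in the abstract refer to the flow-matching-style recipe of training on straight-line paths between noise and data. A minimal sketch of one training pair under that recipe, assuming a standard Gaussian noise source and uniform time sampling (the paper's exact time distribution and parameterization are not specified here):

```python
import numpy as np

def interpolant_training_pair(x1, rng):
    """One linear-interpolant training example:
    x_t = (1 - t) * x0 + t * x1, with x0 ~ N(0, I), t ~ U[0, 1].
    The network is regressed onto the constant velocity x1 - x0."""
    x0 = rng.normal(size=x1.shape)   # noise endpoint of the path
    t = rng.uniform()                # random time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1     # point on the straight path
    v_target = x1 - x0               # velocity target at (xt, t)
    return xt, t, v_target
```

A useful sanity check on the linear path: the data endpoint is always recoverable as x1 = x_t + (1 - t) * (x1 - x0), so a perfectly fit velocity field transports noise to data in straight lines, which is what makes few-step sampling efficient.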
Problem

Research questions and friction points this paper is trying to address.

Generating realistic single-cell gene expression profiles remains challenging
Existing models impose artificial gene orderings or rely on shallow architectures
Capturing the count nature of the data and latent dependencies among genes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent diffusion models generate single-cell gene expression
Multi-head Cross-Attention enables permutation-invariant architecture
Diffusion Transformers replace Gaussian prior for enhanced generation
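The permutation-invariant pooling behind the second bullet can be sketched as cross-attention from a fixed set of learned queries over the variable-size set of gene tokens. The single-head version below omits the learned key/query/value projections of a full MCAB (adding them preserves the invariance); the decoder reverses the roles, with gene-token queries attending over the latents for permutation-equivariant unpooling.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # stabilize before exp
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def mcab_pool(x, queries):
    """Single-head sketch of cross-attention pooling: m learned queries
    attend over n gene tokens x of dim d, producing an (m, d) latent.
    Each output row is a softmax-weighted sum over tokens, so permuting
    the tokens leaves the result unchanged."""
    d = x.shape[-1]
    weights = softmax(queries @ x.T / np.sqrt(d), axis=-1)  # (m, n)
    return weights @ x                                      # (m, d)

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 16))   # 100 gene tokens, dim 16
q = rng.normal(size=(8, 16))     # 8 learned latent queries
z = mcab_pool(x, q)
z_perm = mcab_pool(x[rng.permutation(100)], q)  # shuffled gene order
```

Because the latent has a fixed size (m, d) regardless of how many genes are encoded, the downstream Diffusion Transformer can operate on it without any gene-ordering assumption.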
πŸ”Ž Similar Papers
No similar papers found.
👥 Authors
Giovanni Palla, Chan Zuckerberg Initiative (Computational biology, Machine Learning)
Sudarshan Babu, Chan Zuckerberg Biohub, Chicago, IL
Payam Dibaeinia, Chan Zuckerberg Initiative, Redwood City, CA
James D Pearce, Chan Zuckerberg Initiative, Redwood City, CA
Donghui Li, Chan Zuckerberg Initiative, Redwood City, CA
Aly A. Khan, Chan Zuckerberg Biohub, Chicago, IL
Theofanis Karaletsos, Head of AI, CZI-Science | Achira.ai (Generative AI, AI x Science, Probabilistic Modeling)
Jakub M. Tomczak, Chan Zuckerberg Initiative (Machine Learning, Deep Learning, Generative Models, Generative AI)