🤖 AI Summary
This work addresses the sensitivity of latent diffusion models to sampling perturbations, which often arises from variance collapse in the latent space and leads to degraded generation quality. For the first time, this study explicitly identifies the critical role of sampling perturbation robustness in generative performance and proposes a variance-expansion loss to learn perturbation-robust latent representations while preserving high reconstruction fidelity. The method achieves an adaptive balance through an adversarial trade-off between reconstruction accuracy and latent variance, optimized within a β-VAE encoder framework coupled with diffusion-based sampling. Extensive experiments demonstrate consistent improvements in generation quality across diverse latent diffusion architectures, confirming that enhancing robustness in the latent space effectively stabilizes and elevates image synthesis performance.
📝 Abstract
Latent diffusion models have emerged as the dominant framework for high-fidelity and efficient image generation, owing to their ability to learn diffusion processes in compact latent spaces. However, while previous research has focused primarily on reconstruction accuracy and semantic alignment of the latent space, we observe that another critical factor, robustness to sampling perturbations, also plays a crucial role in determining generation quality. Through empirical and theoretical analyses, we show that the $\beta$-VAE-based tokenizers commonly used in latent diffusion models tend to produce overly compact latent manifolds that are highly sensitive to stochastic perturbations during diffusion sampling, leading to visual degradation. To address this issue, we propose a simple yet effective solution that constructs a latent space robust to sampling perturbations while maintaining strong reconstruction fidelity. Specifically, we introduce a Variance Expansion loss that counteracts variance collapse, and we leverage the adversarial interplay between reconstruction and variance expansion to reach an adaptive balance that preserves reconstruction accuracy while improving robustness to stochastic sampling. Extensive experiments demonstrate that our approach consistently enhances generation quality across different latent diffusion architectures, confirming that robustness in latent space is a key missing ingredient for stable and faithful diffusion sampling.
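The abstract does not give the exact form of the Variance Expansion loss, but the idea it describes, a term that penalizes latent variance collapsing below a target while a reconstruction term pulls in the opposite direction, can be sketched as follows. This is a minimal NumPy illustration under assumed names (`variance_expansion_loss`, `min_var`, the hinge form, and the weight `gamma` are all hypothetical choices, not the paper's definition):

```python
import numpy as np

def variance_expansion_loss(z, min_var=1.0):
    """Hypothetical variance-expansion term: hinge penalty on any
    latent dimension whose batch variance falls below `min_var`,
    counteracting variance collapse in the latent space."""
    per_dim_var = z.var(axis=0)                       # variance of each latent dim
    return np.maximum(min_var - per_dim_var, 0.0).mean()

def total_loss(x, x_hat, z, gamma=1.0, min_var=1.0):
    """Illustrative combined objective: reconstruction fidelity (MSE)
    plus the variance-expansion penalty; `gamma` trades the two off."""
    recon = ((x - x_hat) ** 2).mean()
    return recon + gamma * variance_expansion_loss(z, min_var)

rng = np.random.default_rng(0)
z_collapsed = rng.normal(0.0, 0.1, size=(256, 8))     # near-collapsed latents
z_spread = rng.normal(0.0, 2.0, size=(256, 8))        # well-spread latents

print(variance_expansion_loss(z_collapsed) > 0.0)     # collapsed latents are penalized
print(variance_expansion_loss(z_spread) == 0.0)       # spread latents incur no penalty
```

The adversarial interplay the abstract describes arises because lowering the reconstruction term tends to shrink latent variance, while the expansion term pushes variance back up; training settles at a balance between the two.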