Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models

📅 2025-05-19

🤖 AI Summary

Problem: Quantifying epistemic uncertainty in text-to-image diffusion models remains challenging because scalable, training-free methods are lacking. Method: The paper proposes Epistemic Mixture of Experts (EMoE), a zero-shot framework that estimates epistemic uncertainty by analyzing multi-step denoising trajectories in the latent space of pre-trained diffusion models, with no additional training or architectural modification. Contribution/Results: EMoE enables efficient, scalable uncertainty modeling for billion-parameter diffusion models for the first time. Evaluated on COCO, it shows a strong correlation between estimated uncertainty and generated image quality. It also assigns systematically higher uncertainty to prompts in low-resource languages and to underrepresented geographic categories, which exposes implicit biases in the training data. As an interpretable, post-hoc diagnostic tool, EMoE supports fairness, accountability, and trustworthiness in generative AI systems.

📝 Abstract

Estimating uncertainty in text-to-image diffusion models is challenging because of their large parameter counts (often exceeding 100 million) and operation in complex, high-dimensional spaces with virtually infinite input possibilities. In this paper, we propose Epistemic Mixture of Experts (EMoE), a novel framework for efficiently estimating epistemic uncertainty in diffusion models. EMoE leverages pre-trained networks without requiring additional training, enabling direct uncertainty estimation from a prompt. We leverage a latent space within the diffusion process that captures epistemic uncertainty better than existing methods. Experimental results on the COCO dataset demonstrate EMoE's effectiveness, showing a strong correlation between uncertainty and image quality. Additionally, EMoE identifies under-sampled languages and regions with higher uncertainty, revealing hidden biases in the training set. This capability demonstrates the relevance of EMoE as a tool for addressing fairness and accountability in AI-generated content.
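One common way to turn a set of pre-trained experts into an epistemic uncertainty score is to measure how much their latent outputs disagree for the same prompt. The sketch below illustrates that general idea only; it is not taken from the paper. The function name `epistemic_uncertainty`, the latent shape `(experts, channels, height, width)`, and the use of mean per-element variance as the score are all assumptions, and the random arrays stand in for real denoiser latents.

```python
import numpy as np


def epistemic_uncertainty(expert_latents: np.ndarray) -> float:
    """Return the mean per-element variance across the expert axis (axis 0).

    expert_latents is assumed to have shape (K, C, H, W): K experts, each
    producing one latent tensor for the same prompt and denoising step.
    """
    if expert_latents.ndim < 2 or expert_latents.shape[0] < 2:
        raise ValueError("need at least two experts along axis 0")
    # Variance over experts for every latent element, then averaged to a scalar.
    return float(expert_latents.var(axis=0).mean())


# Synthetic stand-ins for latents (hypothetical values, not model outputs).
rng = np.random.default_rng(0)
shared = rng.normal(size=(4, 8, 8))

# Case 1: all four experts produce the same latent, so they fully agree.
agree = np.stack([shared] * 4)

# Case 2: each expert adds its own perturbation, so they disagree.
disagree = agree + rng.normal(scale=0.5, size=agree.shape)

low_score = epistemic_uncertainty(agree)
high_score = epistemic_uncertainty(disagree)
print(f"agreeing experts: {low_score:.4f}, disagreeing experts: {high_score:.4f}")
```

Under these assumptions, a prompt on which the experts agree scores near zero, while a prompt that pulls them apart scores higher. Averaging the variance to a single scalar is one design choice among several; a per-channel or per-pixel map would keep more spatial detail.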

Problem

Research questions and friction points this paper is trying to address.

Estimating epistemic uncertainty in large text-to-image diffusion models

Identifying hidden biases in training sets via uncertainty analysis

Enhancing fairness in AI-generated content through uncertainty quantification
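The second item above, finding bias through uncertainty analysis, can be pictured as grouping per-prompt uncertainty scores by an attribute such as prompt language and comparing the group means. The following is a minimal sketch of that bookkeeping, not the paper's evaluation code. The helper `mean_uncertainty_by_group`, the group labels `lang_a` and `lang_b`, and the scores are hypothetical placeholders and do not come from the paper's experiments.

```python
from collections import defaultdict
from statistics import mean


def mean_uncertainty_by_group(records):
    """Average uncertainty per group.

    records is an iterable of (group_label, uncertainty_score) pairs.
    """
    buckets = defaultdict(list)
    for group, score in records:
        buckets[group].append(score)
    return {group: mean(scores) for group, scores in buckets.items()}


# Placeholder scores for illustration only (not measured results).
toy_records = [
    ("lang_a", 0.10),
    ("lang_a", 0.12),
    ("lang_b", 0.30),
    ("lang_b", 0.34),
]

by_group = mean_uncertainty_by_group(toy_records)

# Groups sorted from highest to lowest mean uncertainty.
ranked = sorted(by_group, key=by_group.get, reverse=True)
print(ranked)
```

Groups that land at the top of such a ranking would be candidates for under-representation in the training data, which is the kind of signal the summary attributes to EMoE.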

Innovation

Methods, ideas, or system contributions that make the work stand out.

EMoE estimates epistemic uncertainty efficiently

Leverages pre-trained networks without additional training

Uses latent space for better uncertainty capture

🔎 Similar Papers

Rethinking Training for De-biasing Text-to-Image Generation: Unlocking the Potential of Stable Diffusion