SPARKE: Scalable Prompt-Aware Diversity Guidance in Diffusion Models via RKE Score

📅 2025-06-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address insufficient sample diversity in prompt-guided diffusion models under semantically similar prompts, this paper proposes a scalable prompt-aware diversity guidance method. The core technique introduces a conditional latent-variable Rényi kernel entropy (RKE) score as a diversity metric—enabling, for the first time, a dynamic, prompt-aware entropy-guided sampling mechanism. Through theoretical derivation, the computational complexity of entropy estimation and optimization is reduced from O(n³) to O(n). The method integrates prompt embedding similarity modeling, gradient-guided sampling, and conditional-entropy-based diversity assessment. Extensive experiments on multiple text-to-image diffusion models demonstrate significant improvements in generation diversity under semantically similar prompts, while supporting real-time, low-overhead guidance across thousands of prompt iterations.
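To make the diversity metric concrete: the order-2 Rényi kernel entropy of a normalized kernel matrix K (trace 1) is -log ||K||_F², which is maximized when samples are spread out and zero when they collapse to a point. Below is a minimal, hypothetical sketch of such an unconditional RKE-style score with a Gaussian kernel; the function name, bandwidth parameter `sigma`, and the choice of kernel are assumptions for illustration, and the paper's actual method further conditions this measure on prompt embeddings.

```python
import numpy as np

def rke_score(features, sigma=1.0):
    """Order-2 Renyi kernel entropy as a diversity score (illustrative sketch).

    Returns 0 when all samples are identical and approaches log(n) when the
    n samples are mutually far apart under the Gaussian kernel.
    """
    n = len(features)
    # Pairwise squared Euclidean distances.
    sq = np.sum(features**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    # Gaussian kernel, normalized so that trace(K) = 1.
    K = np.exp(-d2 / (2.0 * sigma**2)) / n
    # Order-2 Renyi entropy: H_2(K) = -log tr(K @ K) = -log ||K||_F^2.
    return -np.log(np.sum(K**2))
```

For example, four identical feature vectors score 0, while four mutually distant ones score close to log(4), matching the intuition that higher entropy means higher diversity.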

📝 Abstract
Diffusion models have demonstrated remarkable success in high-fidelity image synthesis and prompt-guided generative modeling. However, ensuring adequate diversity in generated samples of prompt-guided diffusion models remains a challenge, particularly when the prompts span a broad semantic spectrum and the diversity of generated data needs to be evaluated in a prompt-aware fashion across semantically similar prompts. Recent methods have introduced guidance via diversity measures to encourage more varied generations. In this work, we extend the diversity measure-based approaches by proposing the Scalable Prompt-Aware Rényi Kernel Entropy Diversity Guidance (SPARKE) method for prompt-aware diversity guidance. SPARKE utilizes conditional entropy for diversity guidance, which dynamically conditions diversity measurement on similar prompts and enables prompt-aware diversity control. While the entropy-based guidance approach enhances prompt-aware diversity, its reliance on the matrix-based entropy scores poses computational challenges in large-scale generation settings. To address this, we focus on the special case of Conditional latent RKE Score Guidance, reducing entropy computation and gradient-based optimization complexity from the $O(n^3)$ of general entropy measures to $O(n)$. The reduced computational complexity allows for diversity-guided sampling over potentially thousands of generation rounds on different prompts. We numerically test the SPARKE method on several text-to-image diffusion models, demonstrating that the proposed method improves the prompt-aware diversity of the generated data without incurring significant computational costs. We release our code on the project page: https://mjalali.github.io/SPARKE
Problem

Research questions and friction points this paper is trying to address.

Ensuring diversity in prompt-guided diffusion model outputs
Reducing computational complexity of entropy-based diversity guidance
Enhancing prompt-aware diversity without significant computational costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompt-aware diversity guidance via RKE score
Scalable entropy computation reduced to O(n)
Dynamic diversity control for similar prompts
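The scalability claim above rests on a simple observation: for the order-2 entropy, a new candidate sample only changes the kernel matrix through one new row, so its contribution to ||K||_F² (and the gradient of that contribution) costs O(n) against a bank of n previous samples, rather than the O(n³) of a full eigendecomposition. The sketch below is a hypothetical illustration of such an O(n) guidance step, assuming a Gaussian kernel; the function name and the unconditional (prompt-agnostic) form are simplifications of the paper's conditional, prompt-aware guidance.

```python
import numpy as np

def diversity_guidance(x_new, bank, sigma=1.0):
    """O(n) entropy-ascent direction for one candidate sample (illustrative).

    Only the kernel row between x_new and the bank of n previous samples is
    computed. The returned direction pushes x_new away from the bank samples
    it is most similar to, which decreases the new row's contribution to
    ||K||_F^2 and hence increases the order-2 Renyi entropy.
    """
    diff = x_new[None, :] - bank            # (n, d), O(n) work
    d2 = np.sum(diff**2, axis=1)            # squared distances, (n,)
    k2 = np.exp(-d2 / sigma**2)             # k(x_new, x_j)^2 for Gaussian k
    # Gradient-ascent direction on entropy: repel from similar bank samples,
    # with repulsion weighted by squared kernel similarity.
    return (2.0 / sigma**2) * np.sum(k2[:, None] * diff, axis=0)
```

In a diffusion sampler, a direction like this would be added to the denoising update at each guided step; note the push vanishes once the candidate is already far from everything in the bank, so guidance stays low-overhead across many generation rounds.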