Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation

📅 2025-10-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the limited diversity in conditional image generation that results from insufficient modeling of input uncertainty, this paper proposes Rainbow, a framework that integrates Generative Flow Networks (GFlowNets) into conditional generation. Rainbow builds a latent graph, parameterized by a GFlowNet, to encode multiple implicit semantic representations of the condition, and samples diverse trajectories over that graph. Each trajectory guides a pretrained generative model toward a distinct, semantically coherent, and interpretable image. Crucially, it derives this diversity from a single condition, without relying on random seed perturbation or prompt engineering. Experiments on natural and medical imaging datasets report improvements in diversity (+32.7% LPIPS) and fidelity (+18.4% FID) across image synthesis, image generation, and counterfactual generation tasks.

📝 Abstract

Capturing diversity is crucial in conditional and prompt-based image generation, particularly when conditions contain uncertainty that can lead to multiple plausible outputs. To generate diverse images reflecting this diversity, traditional methods often modify random seeds, making it difficult to discern meaningful differences between samples, or diversify the input prompt, which is limited in verbally interpretable diversity. We propose Rainbow, a novel conditional image generation framework, applicable to any pretrained conditional generative model, that addresses inherent condition/prompt uncertainty and generates diverse plausible images. Rainbow is based on a simple yet effective idea: decomposing the input condition into diverse latent representations, each capturing an aspect of the uncertainty and generating a distinct image. First, we integrate a latent graph, parameterized by Generative Flow Networks (GFlowNets), into the prompt representation computation. Second, leveraging GFlowNets' advanced graph sampling capabilities to capture uncertainty and output diverse trajectories over the graph, we produce multiple trajectories that collectively represent the input condition, leading to diverse condition representations and corresponding output images. Evaluations on natural image and medical image datasets demonstrate Rainbow's improvement in both diversity and fidelity across image synthesis, image generation, and counterfactual generation tasks.
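The two steps described in the abstract can be pictured with a minimal toy sketch. Everything in it is an assumption made for illustration: the layered graph `LAYERS`, the node vectors, the uniform placeholder policy, and the rule of adding the mean node vector to the condition embedding are hypothetical and are not taken from the paper. It only shows the overall shape of the idea, in which each sampled trajectory over a latent graph yields a different representation of the same condition.

```python
import random

# Hypothetical toy latent graph: 3 layers, each node carries a small vector.
LAYERS = [
    {"a0": [1.0, 0.0], "a1": [0.0, 1.0]},
    {"b0": [0.5, 0.5], "b1": [-0.5, 0.5]},
    {"c0": [0.2, -0.2], "c1": [0.0, 0.3]},
]

def uniform_policy(layer_index, prefix):
    """Placeholder forward policy (assumption): uniform over the layer's nodes."""
    nodes = LAYERS[layer_index]
    return {name: 1.0 / len(nodes) for name in nodes}

def sample_trajectory(policy, rng):
    """Pick one node per layer; policy(i, prefix) returns {node: probability}."""
    path = []
    for i in range(len(LAYERS)):
        probs = policy(i, tuple(path))
        nodes = list(probs)
        weights = [probs[n] for n in nodes]
        path.append(rng.choices(nodes, weights=weights)[0])
    return path

def condition_representation(cond_embedding, path):
    """Hypothetical combination rule: add the mean visited-node vector."""
    dim = len(cond_embedding)
    mean = [
        sum(LAYERS[i][node][d] for i, node in enumerate(path)) / len(path)
        for d in range(dim)
    ]
    return [c + m for c, m in zip(cond_embedding, mean)]

if __name__ == "__main__":
    rng = random.Random(0)
    cond = [0.0, 0.0]  # stand-in for a prompt or condition embedding
    for _ in range(3):
        path = sample_trajectory(uniform_policy, rng)
        print(path, condition_representation(cond, path))
```

In this sketch, each resulting representation would be passed to a pretrained conditional generator to obtain one image. In Rainbow the policy is a GFlowNet rather than the uniform placeholder used here.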

Problem

Research questions and friction points this paper is trying to address.

Generating diverse images from uncertain input conditions

Decomposing input conditions into distinct latent representations

Improving diversity and fidelity in conditional image generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposing input conditions into diverse latent representations

Integrating GFlowNet-parameterized latent graphs for prompt computation

Leveraging GFlowNet sampling for diverse trajectory generation
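The property that makes GFlowNet sampling suited to diverse generation is that a well-trained GFlowNet samples terminal states with probability proportional to their reward, rather than collapsing onto the single highest-reward state. The sketch below illustrates that property on a small tree where the flows can be computed exactly instead of learned. The tree `CHILDREN`, the rewards in `REWARD`, and all function names are hypothetical, and nothing here reflects how Rainbow defines its graph or reward.

```python
import random

# Hypothetical tree-shaped state space: each internal node lists its children.
CHILDREN = {"root": ["x", "y"], "x": ["x1", "x2"], "y": ["y1"]}
# Hypothetical non-negative rewards on the terminal nodes.
REWARD = {"x1": 1.0, "x2": 2.0, "y1": 3.0}

def flow(node):
    """Flow through a node: total reward of the terminals reachable from it."""
    if node in REWARD:
        return REWARD[node]
    return sum(flow(child) for child in CHILDREN[node])

def forward_policy(node):
    """P_F(child | node) = F(child) / F(node); valid here because the graph is a tree."""
    total = flow(node)
    return {child: flow(child) / total for child in CHILDREN[node]}

def terminal_distribution(node="root", prob=1.0):
    """Exact probability of ending at each terminal under the forward policy."""
    if node in REWARD:
        return {node: prob}
    dist = {}
    for child, p in forward_policy(node).items():
        dist.update(terminal_distribution(child, prob * p))
    return dist

def sample_terminal(rng):
    """Follow the forward policy from the root until a terminal is reached."""
    node = "root"
    while node not in REWARD:
        probs = forward_policy(node)
        kids = list(probs)
        node = rng.choices(kids, weights=[probs[k] for k in kids])[0]
    return node

if __name__ == "__main__":
    print(terminal_distribution())
    print(sample_terminal(random.Random(0)))
```

With these rewards the exact terminal probabilities are 1/6, 1/3, and 1/2, which match the rewards 1, 2, and 3 after normalization. A trained GFlowNet approximates such a policy with a neural network when the state space is too large to enumerate.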
