Test-Time Compositional Generalization in Diffusion Models via Concept Discovery

📅 2026-05-07
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work asks whether a pre-trained diffusion model can generate novel compositions of familiar parts at test time without a predefined concept library. Given a single out-of-distribution query, the method runs gradient ascent on the learned score at multiple noise levels to recover query-relevant local density modes, then maps those modes to Gaussian prototypes in clean data space. Relevant prototypes are selected greedily with a submodular likelihood objective and combined into a Product-of-Experts (PoE) teacher with an analytic score. The teacher can be sampled directly or used to generate a sample pool for training a new class embedding and low-rank adapter. On held-out composition benchmarks built from ColorMNIST and CelebA, both variants outperform query-only and nearest-trained-class baselines.
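The mode-seeking step can be illustrated on a toy problem. The sketch below is not the paper's implementation: it swaps the learned network score for the closed-form score of an equal-weight isotropic Gaussian mixture, and the names `mixture_score` and `ascend_to_mode`, the step size, and the iteration count are all assumptions chosen for illustration. The width `sigma` plays the role of a noise level.

```python
import numpy as np

def mixture_score(x, centers, sigma):
    """Score (gradient of log density) of an equal-weight isotropic Gaussian mixture.

    Toy stand-in for a learned score network (an assumption for illustration).
    """
    diffs = centers - x                                  # (K, D) vectors toward each center
    logw = -np.sum(diffs ** 2, axis=1) / (2 * sigma ** 2)
    w = np.exp(logw - logw.max())                        # numerically stable responsibilities
    w /= w.sum()
    return (w[:, None] * diffs).sum(axis=0) / sigma ** 2

def ascend_to_mode(x0, score_fn, step=0.05, n_steps=500):
    """Plain gradient ascent on log density: x <- x + step * score(x)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * score_fn(x)
    return x

# Hypothetical toy setup: two well-separated centers, query closer to the right one.
centers = np.array([[-2.0, 0.0], [2.0, 0.0]])
query = [1.0, 0.3]
mode = ascend_to_mode(query, lambda x: mixture_score(x, centers, sigma=0.5))
print(mode)
```

In this toy setting the two components stay distinct at small `sigma` and merge into a single mode once `sigma` is large relative to their separation, which is one way to picture why different noise levels can expose different density structure.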
📝 Abstract
Compositional generalization requires models to produce novel configurations from familiar parts. In diffusion models, prior compositional generation methods typically assume that the relevant concepts or conditioning signals are already available. We instead ask whether a pretrained diffusion model can discover query-specific concepts from the time-indexed scores it learns for the noisy marginals $p_t(x_t)$ and compose them at test time. Given a single out-of-distribution query, our method performs gradient ascent on $s_\theta(x_t, t) \approx \nabla_{x_t}\log p_t(x_t)$ at multiple noising timesteps to recover local density modes, maps these modes into clean-space Gaussians, greedily selects relevant prototypes with a submodular likelihood objective, and combines them into a product-of-experts (PoE) teacher model with an analytic score. This teacher model can be sampled directly through classifier-free guidance or used to generate a sample pool for training a new class embedding and low-rank adapter. On held-out composition benchmarks built from ColorMNIST and CelebA, both the analytic PoE sampler and the low-rank adapted model outperform query-only and nearest trained-class baselines. These results suggest that the time-indexed score geometry of the diffusion model contains reusable density-mode concepts that support test-time compositional generation without a predefined concept library.
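Why a product of Gaussian experts has an analytic score can be seen from a standard identity: multiplying Gaussian densities gives another (unnormalized) Gaussian whose precision is the sum of the experts' precisions, and whose score is the sum of the experts' scores. The sketch below only illustrates that identity. The abstract does not give the paper's Gaussian parameterization or expert weighting, so full covariances, equal weights, and the function names `poe_gaussian` and `poe_score` are assumptions.

```python
import numpy as np

def poe_gaussian(means, covs):
    """Mean and covariance of the Gaussian proportional to a product of Gaussians."""
    precisions = [np.linalg.inv(c) for c in covs]
    precision = sum(precisions)                          # precisions add under a product
    cov = np.linalg.inv(precision)
    mean = cov @ sum(p @ m for p, m in zip(precisions, means))
    return mean, cov

def poe_score(x, means, covs):
    """Analytic score of the product: the sum of each expert's Gaussian score."""
    return sum(-np.linalg.inv(c) @ (x - m) for m, c in zip(means, covs))

# Hypothetical prototypes: two unit-covariance experts in 2D.
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
mean, cov = poe_gaussian(means, covs)
print(mean, cov)
print(poe_score(np.array([3.0, 1.0]), means, covs))
```

The summed-score form and the closed-form Gaussian agree, since `poe_score(x, ...)` equals `-inv(cov) @ (x - mean)` for the `mean` and `cov` returned by `poe_gaussian`.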
Problem

Research questions and friction points this paper is trying to address.

compositional generalization
diffusion models
concept discovery
test-time adaptation
out-of-distribution generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

test-time compositional generalization
concept discovery
diffusion models
product-of-experts
score-based generative modeling