Universal Few-Shot Spatial Control for Diffusion Models

📅 2025-09-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing spatial control adapters generalize poorly to novel control tasks and are costly to train. This paper proposes Universal Few-Shot Control (UFC), a few-shot control adapter for pretrained text-to-image diffusion models that works with both UNet and DiT backbones. Given a few image-condition pairs of an unseen task and a query condition, UFC uses the analogy between the query and support conditions to build task-specific control features through a matching mechanism, updating only a small set of task-specific parameters. On six novel spatial control tasks, UFC fine-tuned with only 30 annotated examples achieves fine-grained control consistent with the spatial conditions, and with 0.1% of the full training data it is competitive with fully supervised baselines on various control tasks.

📝 Abstract
Spatial conditioning in pretrained text-to-image diffusion models has significantly improved fine-grained control over the structure of generated images. However, existing control adapters exhibit limited adaptability and incur high training costs when encountering novel spatial control conditions that differ substantially from the training tasks. To address this limitation, we propose Universal Few-Shot Control (UFC), a versatile few-shot control adapter capable of generalizing to novel spatial conditions. Given a few image-condition pairs of an unseen task and a query condition, UFC leverages the analogy between query and support conditions to construct task-specific control features, instantiated by a matching mechanism and an update on a small set of task-specific parameters. Experiments on six novel spatial control tasks show that UFC, fine-tuned with only 30 annotated examples of novel tasks, achieves fine-grained control consistent with the spatial conditions. Notably, when fine-tuned with 0.1% of the full training data, UFC performs competitively with the fully supervised baselines on various control tasks. We also show that UFC is agnostic to the diffusion backbone and demonstrate its effectiveness on both UNet and DiT architectures. Code is available at https://github.com/kietngt00/UFC.
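
The abstract describes building control features by matching a query condition against support conditions. A minimal sketch of one way such matching could work is shown here, assuming scaled dot-product attention over toy feature vectors. This is an illustration only, not the authors' implementation: the function name match_control_features, the choice of attention, and all shapes are hypothetical, and the actual method may differ (see the linked repository).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def match_control_features(query_cond, support_conds, support_imgs):
    """Hypothetical matching step (an assumption, not UFC's actual code).

    query_cond:    list of Q condition vectors (dimension d)
    support_conds: list of S condition vectors (dimension d)
    support_imgs:  list of S image-feature vectors (dimension d_v),
                   where support_imgs[i] is paired with support_conds[i]
    Returns a list of Q vectors (dimension d_v).
    """
    d = len(query_cond[0])
    d_v = len(support_imgs[0])
    features = []
    for q in query_cond:
        # Similarity between this query token and every support condition.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in support_conds
        ]
        weights = softmax(scores)
        # Weighted sum of the paired support image features.
        features.append([
            sum(w * v[j] for w, v in zip(weights, support_imgs))
            for j in range(d_v)
        ])
    return features


# Toy example: two query tokens, two support pairs.
query_cond = [[10.0, 0.0], [0.0, 10.0]]
support_conds = [[10.0, 0.0], [0.0, 10.0]]
support_imgs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
features = match_control_features(query_cond, support_conds, support_imgs)
```

In this toy setup each query token closely matches one support condition, so its output is dominated by that support pair's image feature. In a real adapter the vectors would come from learned encoders, and the resulting features would be injected into the diffusion backbone.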

Problem

Research questions and friction points this paper is trying to address.

Adapting diffusion models to novel spatial control conditions
Reducing training costs for few-shot spatial conditioning
Generalizing control adapters across diverse diffusion architectures

Innovation

Methods, ideas, or system contributions that make the work stand out.

Universal Few-Shot Control adapter
Generalizes to novel spatial conditions
Works with UNet and DiT architectures