Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion

📅 2026-04-27

🤖 AI Summary
This work addresses a limitation of existing controllable diffusion methods: they are typically tied to a specific backbone and come with incompatible training pipelines, parameter formats, and runtime hooks, which hinders reuse across tasks and capability transfer across backbones. The authors propose a unified plugin framework that decouples base-model inference from controllable capability injection. Because the controllability interface is defined at the systems level rather than tied to one control architecture, heterogeneous capability carriers such as KV-Cache and LoRA can coexist under a common abstraction. The framework has three components: Template models map task-specific inputs to an intermediate capability representation, the Template cache serves as the standardized injection interface, and the Template pipeline loads, merges, and injects one or more Template caches into the base diffusion runtime. Case studies on ten controllable generation tasks, ranging from structural control to age control, indicate that the approach unifies these tasks while preserving modularity, composability, and extensibility across backbones.

📝 Abstract
Controllable diffusion methods have substantially expanded the practical utility of diffusion models, but they are typically developed as isolated, backbone-specific systems with incompatible training pipelines, parameter formats, and runtime hooks. This fragmentation makes it difficult to reuse infrastructure across tasks, transfer capabilities across backbones, or compose multiple controls within a single generation pipeline. We present Diffusion Templates, a unified and open plugin framework that decouples base-model inference from controllable capability injection. The framework is organized around three components: Template models that map arbitrary task-specific inputs to an intermediate capability representation, a Template cache that functions as a standardized interface for capability injection, and a Template pipeline that loads, merges, and injects one or more Template caches into the base diffusion runtime. Because the interface is defined at the systems level rather than tied to a specific control architecture, heterogeneous capability carriers such as KV-Cache and LoRA can be supported under the same abstraction. Based on this design, we build a diverse model zoo spanning structural control, brightness adjustment, color adjustment, image editing, super-resolution, sharpness enhancement, aesthetic alignment, content reference, local inpainting, and age control. These case studies show that Diffusion Templates can unify a broad range of controllable generation tasks while preserving modularity, composability, and practical extensibility across rapidly evolving diffusion backbones. All resources will be open sourced, including code, models, and datasets.
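The three-component split described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class names `TemplateModel`, `TemplateCache`, and `TemplatePipeline`, their fields, and the merge rule are hypothetical and are not taken from the authors' code. Plain lists of floats stand in for tensors, and an elementwise scaled sum stands in for whatever fusion rule a real capability carrier would need.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names throughout; this is not the paper's API.
Payload = Dict[str, List[float]]  # layer name -> values (stand-in for tensors)


@dataclass
class TemplateCache:
    """Standardized carrier for one injected capability."""
    kind: str           # carrier type, e.g. "kv_cache" or "lora"
    payload: Payload
    scale: float = 1.0  # strength of this control signal


class TemplateModel:
    """Maps a task-specific input to a TemplateCache."""

    def __init__(self, kind: str, encode: Callable[[object], Payload]):
        self.kind = kind
        self.encode = encode

    def __call__(self, task_input, scale: float = 1.0) -> TemplateCache:
        return TemplateCache(self.kind, self.encode(task_input), scale)


class TemplatePipeline:
    """Merges caches and hands the result to the base model."""

    def __init__(self, base_model: Callable[[str, Dict[str, Payload]], object]):
        self.base_model = base_model

    @staticmethod
    def merge(caches: List[TemplateCache]) -> Dict[str, Payload]:
        # Group by carrier kind, then combine per layer.
        # The scaled sum is a toy fusion rule (an assumption).
        merged: Dict[str, Payload] = {}
        for cache in caches:
            per_kind = merged.setdefault(cache.kind, {})
            for layer, values in cache.payload.items():
                scaled = [cache.scale * v for v in values]
                if layer in per_kind:
                    per_kind[layer] = [a + b for a, b in zip(per_kind[layer], scaled)]
                else:
                    per_kind[layer] = scaled
        return merged

    def __call__(self, prompt: str, caches: List[TemplateCache]):
        return self.base_model(prompt, self.merge(caches))


# Usage: two LoRA-style controls and one KV-cache-style control share one pipeline.
brightness = TemplateModel("lora", lambda level: {"attn.0": [level, level]})
color = TemplateModel("lora", lambda level: {"attn.0": [level, 0.0]})
reference = TemplateModel("kv_cache", lambda text: {"attn.0": [float(len(text))]})

# The base model here is a stub that simply echoes what it was given.
pipeline = TemplatePipeline(lambda prompt, injected: (prompt, injected))
result = pipeline("a cat", [brightness(0.5), color(1.0, scale=2.0), reference("abc")])
```

The point of the sketch is the decoupling: the base model only ever sees the merged dictionary, so adding a new control means writing a new `TemplateModel` rather than touching the pipeline or the backbone.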
Problem

Research questions and friction points this paper is trying to address.

controllable diffusion
fragmentation
backbone-specific systems
capability composition
incompatible pipelines
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion Templates
controllable diffusion
plugin framework
capability injection
modular composition