Noise Consistency Training: A Native Approach for One-Step Generator in Learning Additional Controls

📅 2025-06-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of flexibly integrating novel control signals into pretrained one-step diffusion generative models. We propose Noise Consistency Training (NCT), a lightweight adaptation method that aligns generation behaviors across diverse control conditions in the noise space, without accessing original training data or retraining the base model. NCT introduces a minimal, plug-in adapter module and is the first approach to achieve conditional distribution alignment in the noise domain for one-step generators. We provide theoretical guarantees showing that NCT reduces the divergence between the generated distribution and the target conditional distribution. Experiments demonstrate that NCT achieves state-of-the-art controllable generation performance in a single forward pass, surpassing existing multi-step and knowledge-distillation-based methods in both image quality and inference efficiency. NCT is modular, data-efficient, and eliminates the need for full model retraining.

๐Ÿ“ Abstract
The pursuit of efficient and controllable high-quality content generation remains a central challenge in artificial intelligence-generated content (AIGC). While one-step generators, enabled by diffusion distillation techniques, offer excellent generation quality and computational efficiency, adapting them to new control conditions (such as structural constraints, semantic guidelines, or external inputs) poses a significant challenge. Conventional approaches often necessitate computationally expensive modifications to the base model and subsequent diffusion distillation. This paper introduces Noise Consistency Training (NCT), a novel and lightweight approach to directly integrate new control signals into pre-trained one-step generators without requiring access to original training images or retraining the base diffusion model. NCT operates by introducing an adapter module and employs a noise consistency loss in the noise space of the generator. This loss aligns the adapted model's generation behavior across noises that are conditionally dependent to varying degrees, implicitly guiding it to adhere to the new control. Theoretically, this training objective can be understood as minimizing the distributional distance between the adapted generator and the conditional distribution induced by the new conditions. NCT is modular, data-efficient, and easily deployable, relying only on the pre-trained one-step generator and a control signal model. Extensive experiments demonstrate that NCT achieves state-of-the-art controllable generation in a single forward pass, surpassing existing multi-step and distillation-based methods in both generation quality and computational efficiency. Code is available at https://github.com/Luo-Yihong/NCT
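The core mechanism the abstract describes, aligning a frozen one-step generator's outputs across noises that depend on the condition to varying degrees, can be sketched in a highly simplified toy form. Everything below is an illustrative assumption, not the paper's implementation: the `tanh` generator, the `condition_noise` interpolation, and the `alphas` mixing weights are stand-ins for the actual pre-trained generator, adapter module, and noise-conditioning scheme.

```python
import numpy as np

def one_step_generator(z, W):
    # Stand-in for a frozen pre-trained one-step generator
    # (a single nonlinear map from noise to sample).
    return np.tanh(z @ W)

def condition_noise(z, c, alpha):
    # Toy substitute for the adapter: mix base noise z with a
    # condition-derived direction c; alpha controls how strongly
    # the resulting noise depends on the condition.
    mixed = (1.0 - alpha) * z + alpha * c
    # Renormalize so the mixed input stays noise-like (assumption).
    return mixed / (np.linalg.norm(mixed, axis=-1, keepdims=True) + 1e-8)

def noise_consistency_loss(z, c, W, alphas=(0.2, 0.8)):
    # Consistency objective: the generator's outputs should agree
    # across noises that are conditionally dependent to differing
    # degrees (weakly vs. strongly condition-mixed).
    x_weak = one_step_generator(condition_noise(z, c, alphas[0]), W)
    x_strong = one_step_generator(condition_noise(z, c, alphas[1]), W)
    return float(np.mean((x_weak - x_strong) ** 2))
```

In the actual method, only the lightweight adapter would be trained against such a loss while the generator stays frozen, which is what makes the adaptation cheap and data-efficient.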
Problem

Research questions and friction points this paper is trying to address.

Enhancing one-step generators with new control signals
Avoiding expensive model modifications and retraining
Achieving efficient and high-quality controllable content generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight adapter for control integration
Noise consistency loss in the generator's noise space
Modular training without original data
🔎 Similar Papers