🤖 AI Summary
This work addresses the challenge of modular, multi-concept customization in diffusion models. To enable instant, on-the-fly fusion of heterogeneous concepts (persons, objects, scenes, and artistic styles), we propose BlockLoRA, a framework that mitigates inter-concept interference and identity loss. Methodologically, we introduce Randomized Output Erasure (ROE), a mechanism that suppresses interference between independently customized models, and design Blockwise LoRA Parameterization to preserve identity fidelity during parameter merging. Crucially, our approach requires no additional training at merge time. It achieves high-fidelity composition of up to 15 distinct concepts, outperforming state-of-the-art methods on concept stylization and multi-concept customization benchmarks.
📝 Abstract
Recent diffusion model customization has shown impressive results in incorporating subject or style concepts from a handful of images. However, the modular composition of multiple concepts into a single customized model, which aims to efficiently merge independently trained concepts without compromising their identities, remains unresolved. Modular customization is essential for applications such as concept stylization and multi-concept customization using concepts trained by different users. Existing post-training methods are confined to a fixed set of concepts, and any new combination requires another round of retraining. In contrast, instant merging methods often cause identity loss and interference among the merged concepts, and are usually limited to a small number of concepts. To address these issues, we propose BlockLoRA, an instant merging method designed to efficiently combine multiple concepts while accurately preserving each concept's identity. Through a careful analysis of the underlying cause of interference, we develop the Randomized Output Erasure technique to minimize interference between different customized models. Additionally, we propose Blockwise LoRA Parameterization to reduce identity loss during instant model merging. Extensive experiments validate the effectiveness of BlockLoRA, which can instantly merge 15 concepts of people, subjects, scenes, and styles with high fidelity.
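The paper's exact parameterization is not reproduced here, but the basic algebra of instant LoRA merging can be sketched. Each adapter contributes a low-rank weight update ΔW = B·A; a naive instant merge simply sums these updates, while a blockwise-style merge keeps each adapter in its own disjoint rank block of one wider LoRA so the concepts' factors remain separable. The sketch below (numpy, with illustrative dimensions; all variable names are our own, not the paper's) shows that the two forms produce the same summed delta, while only the blockwise form retains per-concept structure:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2  # illustrative layer sizes and per-concept LoRA rank

# Two independently trained LoRA adapters, each a (B, A) factor pair.
loras = [(rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in)))
         for _ in range(2)]

# Naive instant merge: sum the low-rank updates into one dense delta.
# The individual concepts are no longer recoverable from the result.
delta_naive = sum(B @ A for B, A in loras)

# Blockwise-style merge: concatenate factors so each adapter occupies
# a disjoint rank block of a single wider LoRA.
B_cat = np.concatenate([B for B, _ in loras], axis=1)  # (d_out, 2r)
A_cat = np.concatenate([A for _, A in loras], axis=0)  # (2r, d_in)
delta_block = B_cat @ A_cat

# Block-matrix multiplication makes the two deltas identical, but the
# blockwise form keeps each concept's factors addressable for later
# removal or re-weighting.
assert np.allclose(delta_naive, delta_block)
print(delta_block.shape)
```

This only illustrates the merging arithmetic; the paper's contributions (Randomized Output Erasure and the specific Blockwise LoRA Parameterization) concern how interference and identity loss are controlled beyond this plain sum.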