LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

📅 2024-12-12
🏛️ Computer Vision and Pattern Recognition
📈 Citations: 7
Influential: 0
📄 PDF
🤖 AI Summary
Existing multi-concept customization methods for text-to-image generation suffer from attribute entanglement when fusing multiple LoRA models, necessitating separate fine-tuning to preserve concept distinguishability. This work proposes the first unified LoRA fusion framework that requires no additional fine-tuning. Its core innovation is a contrastive learning–driven weight-space alignment mechanism: by constructing positive and negative sample pairs, it optimizes the mapping of LoRA parameters to enable interference-free, high-discriminability end-to-end integration of multiple concepts within a shared diffusion model. The method jointly incorporates LoRA adaptation, contrastive loss regularization, and diffusion model fine-tuning. Experiments demonstrate substantial improvements in image fidelity and concept controllability for cross-scene compositional generation, while rigorously maintaining the semantic independence of each personalized concept.

Technology Category

Application Category

📝 Abstract
Recent advances in text-to-image customization have enabled high-fidelity, context-rich generation of personalized images, allowing specific concepts to appear in a variety of scenarios. However, current methods struggle with combining multiple personalized models, often leading to attribute entanglement or requiring separate training to preserve concept distinctiveness. We present LoRACLR, a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model without additional individual fine-tuning. LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference. By enforcing distinct yet cohesive representations for each concept, LoRACLR enables efficient, scalable model composition for high-quality, multi-concept image synthesis. Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
Problem

Research questions and friction points this paper is trying to address.

Combining multiple personalized models without attribute entanglement
Merging distinct LoRA models into a unified model efficiently
Ensuring compatibility and minimizing interference in multi-concept synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

LoRACLR merges multiple LoRA models without fine-tuning
Uses contrastive objective to align weight spaces
Enables scalable multi-concept image synthesis efficiently
🔎 Similar Papers
No similar papers found.