🤖 AI Summary
Existing multi-concept customization methods for text-to-image generation suffer from attribute entanglement when fusing multiple LoRA models, and typically require separate fine-tuning to preserve concept distinctiveness. This work proposes a unified LoRA fusion framework that requires no additional fine-tuning. Its core innovation is a contrastive weight-space alignment mechanism: by constructing positive and negative pairs, it optimizes the merged LoRA parameters so that multiple concepts integrate end-to-end into a shared diffusion model with minimal interference and high discriminability. Because the merging operates directly on pre-trained LoRA weights, no retraining of the underlying diffusion model is needed. Experiments demonstrate substantial improvements in image fidelity and concept controllability for cross-scene compositional generation, while preserving the semantic independence of each personalized concept.
📝 Abstract
Recent advances in text-to-image customization have enabled high-fidelity, context-rich generation of personalized images, allowing specific concepts to appear in a variety of scenarios. However, current methods struggle with combining multiple personalized models, often leading to attribute entanglement or requiring separate training to preserve concept distinctiveness. We present LoRACLR, a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model without additional individual fine-tuning. LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference. By enforcing distinct yet cohesive representations for each concept, LoRACLR enables efficient, scalable model composition for high-quality, multi-concept image synthesis. Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
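The abstract describes a contrastive objective that aligns the weight spaces of several LoRA models so a single merged update reproduces each concept without cross-concept interference. The sketch below is a hypothetical NumPy illustration of such an objective, not the paper's actual loss: the names `lora_delta` and `contrastive_merge_loss`, the squared-error positive term, and the hinge-style negative term are all assumptions made for illustration.

```python
import numpy as np


def lora_delta(A, B):
    """Standard LoRA low-rank update: ΔW = B @ A, with A (r, d_in), B (d_out, r)."""
    return B @ A


def contrastive_merge_loss(merged_delta, deltas, inputs, margin=1.0):
    """Hypothetical contrastive alignment objective for LoRA merging.

    Positive pairs: the merged update should reproduce each concept's own
    output on that concept's inputs. Negative pairs: merged outputs for
    different concepts are pushed apart by a hinge with the given margin.
    """
    # Per-concept reference outputs from the individual LoRA deltas.
    outs = [d @ x for d, x in zip(deltas, inputs)]
    # Outputs of the single merged delta on each concept's inputs.
    merged_outs = [merged_delta @ x for x in inputs]

    pos, neg = 0.0, 0.0
    n = len(deltas)
    for i in range(n):
        # Positive term: stay close to concept i's own behavior.
        pos += np.sum((merged_outs[i] - outs[i]) ** 2)
        for j in range(n):
            if i != j:
                # Negative term: keep distinct concepts separated.
                dist = np.linalg.norm(merged_outs[i] - outs[j])
                neg += max(0.0, margin - dist) ** 2
    return pos + neg
```

In this toy formulation, a naive merge (e.g. averaging the per-concept deltas) could serve as the starting point, with the loss then minimized over the merged parameters; with a single concept and `merged_delta` equal to that concept's own delta, the loss is exactly zero.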