🤖 AI Summary
This work addresses a central challenge in personalized image generation: existing LoRA composition methods struggle to balance content fidelity with style consistency, often producing entangled content-style representations, weak controllability, and unstable fusion. To overcome these limitations, the authors propose a training-free decoupled fusion framework that separates content and style subspaces through rank-constrained fine-tuning. They introduce a prompt-guided multi-branch expert encoder to enable semantically controllable adapter aggregation, and a timestep-dependent classifier-free guidance mechanism to enhance generation stability. Together, these components achieve retraining-free disentangled LoRA fusion that preserves high-fidelity content while supporting flexible semantic control, significantly outperforming current state-of-the-art methods.
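The "semantically controllable adapter aggregation" above amounts to merging several LoRA deltas with weights derived from the prompt. A minimal sketch of that merge step follows; the function name, the fixed weights, and the flat adapter list are illustrative assumptions, and the paper's expert-encoder branches that produce the weights are not reproduced here:

```python
import numpy as np

def aggregate_lora_adapters(adapters, weights):
    """Hedged sketch of selective adapter aggregation.

    Each adapter is a LoRA pair (A, B) whose delta to the frozen base
    weight is B @ A; the merged delta is a weighted sum of the deltas.
    In the paper the weights would come from a prompt-guided expert
    encoder, not be hand-set as in this illustration.
    """
    return sum(w * (B @ A) for w, (A, B) in zip(weights, adapters))

# Usage: two rank-2 adapters for a 4x4 weight matrix, e.g. a "content"
# adapter weighted 0.7 and a "style" adapter weighted 0.3.
rng = np.random.default_rng(0)
adapters = [
    (rng.standard_normal((2, 4)), rng.standard_normal((4, 2)))
    for _ in range(2)
]
delta = aggregate_lora_adapters(adapters, [0.7, 0.3])
```

Because the merge is linear in the weights, setting one weight to zero cleanly removes that adapter's contribution, which is what makes selective (per-concept) control possible without retraining.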
📝 Abstract
Personalized image generation must balance content fidelity with stylistic consistency when synthesizing images from text and reference examples. Low-Rank Adaptation (LoRA) offers an efficient personalization approach, and combining LoRA weights trained on different concepts promises precise control. However, existing combination techniques face persistent challenges: entanglement between content and style representations, insufficient guidance for controlling each element's influence, and unstable weight fusion that often requires additional training. We address these limitations with CRAFT-LoRA, which has three complementary components: (1) rank-constrained backbone fine-tuning that injects low-rank projection residuals to encourage learning decoupled content and style subspaces; (2) a prompt-guided expert encoder with specialized branches that enables semantic extension and precise control through selective adapter aggregation; and (3) a training-free, timestep-dependent classifier-free guidance scheme that enhances generation stability by strategically adjusting noise predictions across diffusion steps. Our method significantly improves content-style disentanglement, enables flexible semantic control over LoRA module combinations, and achieves high-fidelity generation without additional retraining overhead.
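Component (3) modulates standard classifier-free guidance with a scale that depends on the diffusion timestep. The sketch below illustrates the general idea with a simple linear schedule; the function names, the linear decay, and the scale bounds are illustrative assumptions, not the paper's actual scheme:

```python
def timestep_cfg_scale(t, num_steps, w_max=7.5, w_min=3.0):
    """Hedged sketch: a guidance weight that decays over the denoising
    trajectory (schedule and bounds are illustrative, not from the paper).
    Early, high-noise steps get strong guidance to fix global structure;
    late steps get weaker guidance to preserve fine detail."""
    progress = t / max(num_steps - 1, 1)  # 1.0 at the noisiest step, 0.0 at the last
    return w_min + (w_max - w_min) * progress

def guided_noise(eps_uncond, eps_cond, t, num_steps):
    """Standard classifier-free guidance, with the scale above swapped in
    for the usual constant weight."""
    w = timestep_cfg_scale(t, num_steps)
    return eps_uncond + w * (eps_cond - eps_uncond)
```

In a real sampler, `eps_uncond` and `eps_cond` would be the model's noise predictions for the empty and full prompts at step `t`; varying `w` across steps is what the abstract calls "strategically adjusting noise predictions across diffusion steps."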