SplitFlux: Learning to Decouple Content and Style from a Single Image

📅 2025-11-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited customization quality caused by difficult content-style disentanglement in single-image generation, this paper proposes a stage-wise controllable fine-tuning framework built on the Flux diffusion model. Mechanistic analysis reveals that early single stream blocks predominantly encode content, whereas later blocks govern style. Guided by this insight, the authors introduce Rank-Constrained Adaptation and Visual-Gated LoRA, two complementary techniques applied to the Single Dream Blocks, to achieve low-rank parameter compression, magnitude amplification, and visual gating, thereby suppressing content leakage while enabling precise detail re-embedding. Experiments demonstrate that the method significantly outperforms SDXL-based baselines in both content fidelity and style transfer quality, achieving state-of-the-art performance on cross-context style transfer and content re-editing tasks.

📝 Abstract
Disentangling image content and style is essential for customized image generation. Existing SDXL-based methods struggle to achieve high-quality results, while the recently proposed Flux model fails to achieve effective content-style separation due to its underexplored characteristics. To address these challenges, we conduct a systematic analysis of Flux and make two key observations: (1) Single Dream Blocks are essential for image generation; and (2) Early single stream blocks mainly control content, whereas later blocks govern style. Based on these insights, we propose SplitFlux, which disentangles content and style by fine-tuning the single dream blocks via LoRA, enabling the disentangled content to be re-embedded into new contexts. It includes two key components: (1) Rank-Constrained Adaptation. To preserve content identity and structure, we compress the rank and amplify the magnitude of updates within specific blocks, preventing content leakage into style blocks. (2) Visual-Gated LoRA. We split the content LoRA into two branches with different ranks, guided by image saliency. The high-rank branch preserves primary subject information, while the low-rank branch encodes residual details, mitigating content overfitting and enabling seamless re-embedding. Extensive experiments demonstrate that SplitFlux consistently outperforms state-of-the-art methods, achieving superior content preservation and stylization quality across diverse scenarios.
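The Visual-Gated LoRA described in the abstract splits the content update into a high-rank and a low-rank branch mixed by an image-saliency gate. A minimal NumPy sketch of how such a dual-branch, saliency-gated low-rank update might be computed (the function name, ranks, and the per-token gating form are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def visual_gated_lora_delta(x, saliency, A_hi, B_hi, A_lo, B_lo, scale=1.0):
    """Hypothetical sketch of a visual-gated dual-branch LoRA update.

    x        : (tokens, dim) activations entering the adapted layer
    saliency : (tokens,) values in [0, 1]; 1 = primary subject region
    A_hi/B_hi: high-rank branch factors, (dim, r_hi) and (r_hi, dim)
    A_lo/B_lo: low-rank branch factors,  (dim, r_lo) and (r_lo, dim)
    """
    gate = saliency[:, None]              # broadcast gate over channels
    hi = x @ A_hi @ B_hi                  # preserves primary subject information
    lo = x @ A_lo @ B_lo                  # encodes residual details
    return scale * (gate * hi + (1.0 - gate) * lo)

rng = np.random.default_rng(0)
dim, r_hi, r_lo, tokens = 64, 16, 4, 8
x = rng.standard_normal((tokens, dim))
sal = rng.uniform(size=tokens)
A_hi, B_hi = rng.standard_normal((dim, r_hi)), rng.standard_normal((r_hi, dim))
A_lo, B_lo = rng.standard_normal((dim, r_lo)), rng.standard_normal((r_lo, dim))
delta = visual_gated_lora_delta(x, sal, A_hi, B_hi, A_lo, B_lo)
print(delta.shape)  # (8, 64)
```

With saliency fixed at 1 everywhere, only the high-rank branch contributes, matching the intuition that salient (subject) tokens are served by the high-rank branch and the rest by the low-rank one.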
Problem

Research questions and friction points this paper is trying to address.

Disentangling image content and style from a single image
Addressing content-style separation limitations in Flux models
Enabling customized image generation with preserved content identity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tunes single dream blocks via LoRA
Uses rank-constrained adaptation to prevent content leakage into style blocks
Splits content LoRA with visual-gated branches
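The rank-constrained adaptation bullet ("compress the rank and amplify the magnitude of updates") can be sketched as constraining a weight update to low rank and rescaling it. One plausible illustration uses SVD truncation; the function, `k`, and `gamma` below are hypothetical, not the paper's actual procedure:

```python
import numpy as np

def rank_constrained_update(delta_w, k, gamma):
    """Sketch: project a weight update onto its top-k singular directions,
    then amplify the result by gamma (k and gamma are illustrative
    hyperparameters, not values from the paper)."""
    U, s, Vt = np.linalg.svd(delta_w, full_matrices=False)
    s_trunc = np.where(np.arange(s.size) < k, s, 0.0)  # keep top-k singular values
    return gamma * (U * s_trunc) @ Vt

rng = np.random.default_rng(1)
dW = rng.standard_normal((32, 32))        # stand-in for a learned LoRA update
compressed = rank_constrained_update(dW, k=4, gamma=2.0)
print(np.linalg.matrix_rank(compressed))  # 4
```

The rank cap limits how much content information the update can carry into style blocks, while the amplification factor compensates for the signal lost to truncation.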
🔎 Similar Papers
2024-07-01 · arXiv.org · Citations: 3
Yitong Yang (Shanghai University of Finance and Economics), Yinglin Wang (School of Computing and Artificial Intelligence, Shanghai University of Finance and Economics), Changshuo Wang (Department of Computer Science, University College London), Yongjun Zhang (College of Computer Science and Technology, Guizhou University), Ziyang Chen (Peking University), Shuting He (Shanghai University of Finance and Economics)