SplitFlux: Learning to Decouple Content and Style from a Single Image

📅 2025-11-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited customization quality caused by difficult content-style disentanglement in single-image generation, this paper proposes a stage-wise controllable fine-tuning framework built on the Flux diffusion model. Mechanistic analysis reveals that early single stream blocks predominantly encode content, whereas later blocks govern style. Guided by this insight, the authors introduce Rank-Constrained Adaptation and Visual-Gated LoRA, two complementary techniques applied to the Single Dream Blocks, to achieve low-rank parameter compression, magnitude amplification, and visual gating, thereby suppressing content leakage while enabling precise detail re-embedding. Experiments demonstrate that the method significantly outperforms SDXL-based baselines in both content fidelity and style transfer quality, achieving state-of-the-art performance on cross-context style transfer and content re-editing tasks.

📝 Abstract
Disentangling image content and style is essential for customized image generation. Existing SDXL-based methods struggle to achieve high-quality results, while the recently proposed Flux model fails to achieve effective content-style separation due to its underexplored characteristics. To address these challenges, we conduct a systematic analysis of Flux and make two key observations: (1) Single Dream Blocks are essential for image generation; and (2) Early single stream blocks mainly control content, whereas later blocks govern style. Based on these insights, we propose SplitFlux, which disentangles content and style by fine-tuning the single dream blocks via LoRA, enabling the disentangled content to be re-embedded into new contexts. It includes two key components: (1) Rank-Constrained Adaptation. To preserve content identity and structure, we compress the rank and amplify the magnitude of updates within specific blocks, preventing content leakage into style blocks. (2) Visual-Gated LoRA. We split the content LoRA into two branches with different ranks, guided by image saliency. The high-rank branch preserves primary subject information, while the low-rank branch encodes residual details, mitigating content overfitting and enabling seamless re-embedding. Extensive experiments demonstrate that SplitFlux consistently outperforms state-of-the-art methods, achieving superior content preservation and stylization quality across diverse scenarios.
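The Visual-Gated LoRA described in the abstract splits the content update into a high-rank and a low-rank branch mixed by an image-saliency gate. A minimal NumPy sketch of how such a dual-branch, saliency-gated low-rank update might be computed (the function name, ranks, and the per-token gating form are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def visual_gated_lora_delta(x, saliency, A_hi, B_hi, A_lo, B_lo, scale=1.0):
    """Hypothetical sketch of a visual-gated dual-branch LoRA update.

    x        : (tokens, dim) activations entering the adapted layer
    saliency : (tokens,) values in [0, 1]; 1 = primary subject region
    A_hi/B_hi: high-rank branch factors, (dim, r_hi) and (r_hi, dim)
    A_lo/B_lo: low-rank branch factors,  (dim, r_lo) and (r_lo, dim)
    """
    gate = saliency[:, None]              # broadcast gate over channels
    hi = x @ A_hi @ B_hi                  # preserves primary subject information
    lo = x @ A_lo @ B_lo                  # encodes residual details
    return scale * (gate * hi + (1.0 - gate) * lo)

rng = np.random.default_rng(0)
dim, r_hi, r_lo, tokens = 64, 16, 4, 8
x = rng.standard_normal((tokens, dim))
sal = rng.uniform(size=tokens)
A_hi, B_hi = rng.standard_normal((dim, r_hi)), rng.standard_normal((r_hi, dim))
A_lo, B_lo = rng.standard_normal((dim, r_lo)), rng.standard_normal((r_lo, dim))
delta = visual_gated_lora_delta(x, sal, A_hi, B_hi, A_lo, B_lo)
print(delta.shape)  # (8, 64)
```

With saliency fixed at 1 everywhere, only the high-rank branch contributes, matching the intuition that salient (subject) tokens are served by the high-rank branch and the rest by the low-rank one.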
Problem

Research questions and friction points this paper is trying to address.

Disentangling image content and style from a single image
Addressing content-style separation limitations in Flux models
Enabling customized image generation with preserved content identity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tunes single dream blocks via LoRA
Uses rank-constrained adaptation to prevent content leakage into style blocks
Splits content LoRA with visual-gated branches
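The rank-constrained adaptation bullet ("compress the rank and amplify the magnitude of updates") can be sketched as constraining a weight update to low rank and rescaling it. One plausible illustration uses SVD truncation; the function, `k`, and `gamma` below are hypothetical, not the paper's actual procedure:

```python
import numpy as np

def rank_constrained_update(delta_w, k, gamma):
    """Sketch: project a weight update onto its top-k singular directions,
    then amplify the result by gamma (k and gamma are illustrative
    hyperparameters, not values from the paper)."""
    U, s, Vt = np.linalg.svd(delta_w, full_matrices=False)
    s_trunc = np.where(np.arange(s.size) < k, s, 0.0)  # keep top-k singular values
    return gamma * (U * s_trunc) @ Vt

rng = np.random.default_rng(1)
dW = rng.standard_normal((32, 32))        # stand-in for a learned LoRA update
compressed = rank_constrained_update(dW, k=4, gamma=2.0)
print(np.linalg.matrix_rank(compressed))  # 4
```

The rank cap limits how much content information the update can carry into style blocks, while the amplification factor compensates for the signal lost to truncation.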
🔎 Similar Papers
2024-07-01 · arXiv.org · Citations: 3
Yitong Yang (Shanghai University of Finance and Economics), Yinglin Wang (School of Computing and Artificial Intelligence, Shanghai University of Finance and Economics), Changshuo Wang (Department of Computer Science, University College London), Yongjun Zhang (College of Computer Science and Technology, Guizhou University), Ziyang Chen (Peking University), Shuting He (Shanghai University of Finance and Economics)