🤖 AI Summary
This work addresses the challenges of personalized style control and content-style disentanglement in image stylization without additional training. We propose an inference-time stylization method built upon a pretrained scale-wise autoregressive model, featuring a three-branch prompt-guidance mechanism that explicitly decouples content and style during inference. A step-level and attention-level intervention analysis reveals that early-to-mid generation stages play the dominant role in structuring content and encoding style; motivated by this, we introduce key-stage attention sharing and adaptive query sharing. Fine-grained collaborative control is achieved via step-level and attention-level interventions, joint prompt-feature injection, and query-similarity fusion. Experiments demonstrate that our approach matches fine-tuning methods in style fidelity and prompt alignment, achieves significantly faster inference, and exhibits strong cross-style generalization and deployment flexibility.
📝 Abstract
We present a training-free framework for style-personalized image generation that controls content and style information during inference using a scale-wise autoregressive model. Our method employs a three-path design (content, style, and generation), each path guided by a corresponding text prompt, enabling flexible and efficient control over image semantics without any additional training. A central contribution of this work is a step-wise and attention-wise intervention analysis. Through systematic prompt and feature injection, we find that early-to-middle generation steps play a pivotal role in shaping both content and style, and that query features predominantly encode content-specific information. Guided by these insights, we introduce two targeted mechanisms: Key Stage Attention Sharing, which aligns content and style during the semantically critical steps, and Adaptive Query Sharing, which reinforces content semantics in later steps through similarity-aware query blending. Extensive experiments demonstrate that our method achieves competitive style fidelity and prompt fidelity compared to fine-tuned baselines, while offering faster inference and greater deployment flexibility.
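To make the "similarity-aware query blending" idea concrete, here is a minimal sketch of how Adaptive Query Sharing could look in PyTorch. The function name, the sigmoid-based weighting, and the temperature `tau` are illustrative assumptions, not the paper's exact formulation; the sketch only captures the stated principle that generation-branch queries are blended toward content-branch queries in proportion to their similarity.

```python
import torch
import torch.nn.functional as F


def adaptive_query_sharing(q_gen: torch.Tensor,
                           q_content: torch.Tensor,
                           tau: float = 1.0) -> torch.Tensor:
    """Hypothetical similarity-aware query blending.

    q_gen, q_content: attention query features of shape (batch, tokens, dim),
    from the generation and content paths respectively.
    """
    # Token-wise cosine similarity between the two sets of queries.
    sim = F.cosine_similarity(q_gen, q_content, dim=-1)   # (batch, tokens)
    # Map similarity to a blending weight in (0, 1); tau is an assumed
    # temperature controlling how sharply similarity drives the blend.
    alpha = torch.sigmoid(sim / tau).unsqueeze(-1)        # (batch, tokens, 1)
    # Convex blend: tokens whose generation queries already resemble the
    # content queries lean further toward the content path.
    return alpha * q_content + (1.0 - alpha) * q_gen
```

In a real pipeline this fusion would replace the generation path's queries inside self-attention at the later steps the analysis identifies, while keys and values are handled by the attention-sharing mechanism.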