StyleGallery: Training-free and Semantic-aware Personalized Style Transfer from Arbitrary Image References

📅 2026-03-11
🤖 AI Summary
This work addresses three limitations of existing diffusion-based style transfer methods: semantic misalignment between the style reference and the content, reliance on external constraints such as semantic masks, and the absence of adaptive global-local alignment. Together these hinder precise and flexible personalization. The paper proposes StyleGallery, a training-free, semantic-aware framework that accepts arbitrary style reference images. It combines adaptive clustering of latent diffusion features, block-filtered feature matching between the resulting regions, and energy-guided diffusion sampling with a regional style loss, which removes the dependence on external annotations and supports multiple style references. On a newly introduced benchmark, the authors report that the approach outperforms state-of-the-art methods in content structure preservation, regional stylization, interpretability, and personalized customization.

📝 Abstract
Despite the advancements in diffusion-based image style transfer, existing methods are commonly limited by 1) semantic gap: the style reference could miss proper content semantics, causing uncontrollable stylization; 2) reliance on extra constraints (e.g., semantic masks) restricting applicability; 3) rigid feature associations lacking adaptive global-local alignment, failing to balance fine-grained stylization and global content preservation. These limitations, particularly the inability to flexibly leverage style inputs, fundamentally restrict style transfer in terms of personalization, accuracy, and adaptability. To address these, we propose StyleGallery, a training-free and semantic-aware framework that supports arbitrary reference images as input and enables effective personalized customization. It comprises three core stages: semantic region segmentation (adaptive clustering on latent diffusion features to divide regions without extra inputs); clustered region matching (block filtering on extracted features for precise alignment); and style transfer optimization (energy function-guided diffusion sampling with regional style loss to optimize stylization). Experiments on our introduced benchmark demonstrate that StyleGallery outperforms state-of-the-art methods in content structure preservation, regional stylization, interpretability, and personalized customization, particularly when leveraging multiple style references.
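
The three stages named in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation, and every name in it (`kmeans`, `match_regions`, `regional_style_loss`) is hypothetical. It makes several simplifying assumptions: random 2-D vectors stand in for latent diffusion features, plain k-means with a fixed k stands in for the adaptive clustering, cosine similarity between cluster centroids stands in for block-filtered matching, and a mean/standard-deviation discrepancy stands in for the regional style loss. In an energy-guided sampler, the gradient of such a loss with respect to the latent would be used to nudge each denoising step; that part is not shown.

```python
import numpy as np


def kmeans(feats, k, iters=25, seed=0):
    """Plain k-means on (N, D) features; returns labels (N,) and centers (k, D).

    Assumption: a fixed k is used here for brevity, whereas the paper
    describes its clustering as adaptive.
    """
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = feats[labels == j].mean(axis=0)
    dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1), centers


def match_regions(content_centers, style_centers):
    """For each content cluster, return the index of the most similar style cluster.

    Assumption: cosine similarity of centroids replaces the paper's
    block-filtered feature matching, whose details are not given here.
    """
    c_norm = np.maximum(np.linalg.norm(content_centers, axis=1, keepdims=True), 1e-8)
    s_norm = np.maximum(np.linalg.norm(style_centers, axis=1, keepdims=True), 1e-8)
    similarity = (content_centers / c_norm) @ (style_centers / s_norm).T
    return similarity.argmax(axis=1)


def regional_style_loss(c_feats, c_labels, s_feats, s_labels, matches):
    """Sum of squared mean and std differences over matched region pairs."""
    loss = 0.0
    for c_idx, s_idx in enumerate(matches):
        c_region = c_feats[c_labels == c_idx]
        s_region = s_feats[s_labels == s_idx]
        if len(c_region) == 0 or len(s_region) == 0:
            continue  # nothing to compare for an empty region
        loss += np.sum((c_region.mean(axis=0) - s_region.mean(axis=0)) ** 2)
        loss += np.sum((c_region.std(axis=0) - s_region.std(axis=0)) ** 2)
    return float(loss)


# Toy data: two well-separated "semantic regions" per image (hypothetical values).
rng = np.random.default_rng(42)
c_feats = np.concatenate([rng.normal([5.0, 0.0], 0.1, size=(50, 2)),
                          rng.normal([0.0, 5.0], 0.1, size=(50, 2))])
s_feats = np.concatenate([rng.normal([0.0, 6.0], 0.1, size=(50, 2)),
                          rng.normal([6.0, 0.0], 0.1, size=(50, 2))])

c_labels, content_centers = kmeans(c_feats, k=2)
s_labels, style_centers = kmeans(s_feats, k=2)
matches = match_regions(content_centers, style_centers)
loss = regional_style_loss(c_feats, c_labels, s_feats, s_labels, matches)
print("content cluster -> style cluster:", matches.tolist())
print("regional style loss:", loss)
```

Because the loss is computed per matched region rather than over the whole image, each content region is pulled only toward the statistics of its own style counterpart, which is the intuition behind the regional stylization described above.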
Problem

Research questions and friction points this paper is trying to address.

style transfer
semantic gap
personalization
diffusion models
content preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

training-free
semantic-aware
style transfer
diffusion models
personalized customization
Authors
Boyu He
College of Computer Science and Technology, National University of Defense Technology
Yunfan Ye
National University of Defense Technology
Low-level Vision, Computer Graphics, Edge Detection
Chang Liu
College of Computer Science and Technology, National University of Defense Technology
Weishang Wu
College of Computer Science and Technology, National University of Defense Technology
Fang Liu
Professor, Hunan University
Service Computing, Edge Computing, Big data management and storage, Intelligent Interaction Design
Zhiping Cai
College of Computer Science and Technology, National University of Defense Technology