FantasyStyle: Controllable Stylized Distillation for 3D Gaussian Splatting

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper targets two failure modes in 3D Gaussian Splatting (3DGS) style transfer: appearance distortion caused by multi-view inconsistency, and content leakage caused by dependence on VGG features. It proposes FantasyStyle, which the authors describe as the first 3DGS stylization framework to rely entirely on diffusion-model distillation. The method has two parts. A multi-view frequency consistency mechanism applies a 3D filter to the multi-view noisy latents and selectively reduces their low-frequency components, which mitigates conflicts between stylized priors across views. Controllable stylized distillation uses negative guidance to suppress content leakage and over-stylization, and removes the reconstruction term so that the style signal dominates the update. The stated contributions are threefold: (1) the first 3D style transfer approach built entirely on diffusion distillation; (2) improved cross-view consistency of the stylized appearance; and (3) better separation of style and content, which reduces structural distortion and content leakage. The reported experiments show higher stylization quality, visual realism, and stylistic controllability across diverse scenes and artistic styles.

📝 Abstract
The success of 3DGS in generative and editing applications has sparked growing interest in 3DGS-based style transfer. However, current methods still face two major challenges: (1) multi-view inconsistency often leads to style conflicts, resulting in appearance smoothing and distortion; and (2) heavy reliance on VGG features, which struggle to disentangle style and content from style images, often causes content leakage and excessive stylization. To tackle these issues, we introduce FantasyStyle, a 3DGS-based style transfer framework, and the first to rely entirely on diffusion model distillation. It comprises two key components: (1) Multi-View Frequency Consistency. We enhance cross-view consistency by applying a 3D filter to the multi-view noisy latents, selectively reducing low-frequency components to mitigate stylized prior conflicts. (2) Controllable Stylized Distillation. To suppress content leakage from style images, we introduce negative guidance to exclude undesired content. In addition, we identify the limitations of Score Distillation Sampling and Delta Denoising Score in 3D style transfer and remove the reconstruction term accordingly. Building on these insights, we propose controllable stylized distillation, which leverages negative guidance to optimize the 3D Gaussians more effectively. Extensive experiments demonstrate that our method consistently outperforms state-of-the-art approaches, achieving higher stylization quality and visual realism across various scenes and styles.
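The abstract does not specify the 3D filter, so the following is only a rough sketch of what "selectively reducing low-frequency components" of multi-view noisy latents could look like. It assumes the latents are stacked as a (views, channels, height, width) array, takes a 3D FFT over the view, height, and width axes, and scales every frequency inside a spherical low-frequency band. The function name attenuate_low_freq and the parameters cutoff and scale are hypothetical and are not taken from the paper.

```python
import numpy as np


def attenuate_low_freq(latents, cutoff=0.25, scale=0.5):
    """Scale the low-frequency band of a stack of multi-view latents.

    latents: array of shape (V, C, H, W).
    cutoff:  radius of the low-frequency band in normalized frequency
             (hypothetical parameter; each axis spans [-0.5, 0.5)).
    scale:   gain applied inside the band (hypothetical parameter;
             1.0 leaves the latents unchanged, 0.0 removes the band).
    """
    v, _, h, w = latents.shape
    axes = (0, 2, 3)  # 3D transform over view, height, width

    spectrum = np.fft.fftn(latents, axes=axes)

    # Normalized frequency coordinates, shaped to broadcast over channels.
    fv = np.fft.fftfreq(v)[:, None, None, None]
    fh = np.fft.fftfreq(h)[None, None, :, None]
    fw = np.fft.fftfreq(w)[None, None, None, :]
    radius = np.sqrt(fv**2 + fh**2 + fw**2)

    # Attenuate frequencies inside the band, keep the rest untouched.
    gain = np.where(radius <= cutoff, scale, 1.0)

    filtered = np.fft.ifftn(spectrum * gain, axes=axes)
    return filtered.real  # the gain is symmetric, so the result is real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy_latents = rng.normal(size=(4, 4, 16, 16))
    filtered = attenuate_low_freq(noisy_latents, cutoff=0.2, scale=0.5)
    print(filtered.shape)
```

Including the view axis in the transform is what makes the filter 3D rather than a per-image 2D filter: the low-frequency band then covers content that varies slowly across neighboring views as well as within each view.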
Problem

Research questions and friction points this paper is trying to address.

Resolves multi-view inconsistency causing style conflicts and distortion
Reduces reliance on VGG features to prevent content leakage
Improves stylization quality via controllable diffusion model distillation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-view frequency consistency for style conflict reduction
Controllable stylized distillation with negative guidance
Diffusion model distillation without reconstruction term
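As an illustration of the last two items, the sketch below contrasts a standard score-distillation residual with one that drops the reconstruction term. It relies on the usual classifier-free guidance identity, with a negative-prompt prediction standing in for the unconditional one; the exact loss used in the paper is not given in this summary, so this decomposition is an assumption. The names sds_direction, guidance_only_direction, eps_style, eps_negative, and eps_noise are hypothetical, and the timestep weighting is omitted.

```python
import numpy as np


def sds_direction(eps_style, eps_negative, eps_noise, guidance_scale=7.5):
    """Score-distillation residual with a negative prompt.

    eps_style:    noise predicted under the style condition.
    eps_negative: noise predicted under the negative prompt
                  (used where the unconditional prediction usually goes).
    eps_noise:    the Gaussian noise that was actually added.
    """
    eps_guided = eps_negative + guidance_scale * (eps_style - eps_negative)
    return eps_guided - eps_noise


def guidance_only_direction(eps_style, eps_negative, guidance_scale=7.5):
    """Same residual with the (eps_negative - eps_noise) reconstruction
    term removed, leaving only the push away from the negative prompt
    and toward the style condition (assumed form, not from the paper)."""
    return guidance_scale * (eps_style - eps_negative)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (4, 16, 16)
    eps_style = rng.normal(size=shape)
    eps_negative = rng.normal(size=shape)
    eps_noise = rng.normal(size=shape)

    full = sds_direction(eps_style, eps_negative, eps_noise)
    reduced = guidance_only_direction(eps_style, eps_negative)

    # The two directions differ exactly by the reconstruction term.
    print(np.allclose(full - reduced, eps_negative - eps_noise))
```

In a real pipeline the three predictions would come from a diffusion model evaluated on the noised render, and the resulting direction would be backpropagated through the renderer to the 3D Gaussians.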
Yitong Yang
Shanghai University of Finance and Economics
Yinglin Wang
School of Computing and Artificial Intelligence, Shanghai University of Finance and Economics
Changshuo Wang
Department of Computer Science, University College London, University of London
Huajie Wang
Management Science and Engineering, Shandong University of Finance and Economics
Shuting He
Assistant Professor, Shanghai University of Finance and Economics
Computer Vision