FPGS: Feed-Forward Semantic-aware Photorealistic Style Transfer of Large-Scale Gaussian Splatting

πŸ“… 2025-03-11
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the challenge of achieving high-quality, multi-view consistent, real-time style transfer for large-scale 3D scenes. The authors propose FPGS, an optimization-free, feed-forward, semantics-aware style transfer framework for 3D Gaussian Splatting. Methodologically, FPGS introduces a style-decoupled 3D feature field coupled with a local AdaIN fusion mechanism, enabling user-controllable style injection into radiance field representations via semantic correspondence matching across multiple reference images. Unlike prior approaches, it requires neither per-style re-optimization nor retraining, and supports flexible control with arbitrary single or multiple reference images. Evaluation on large static and dynamic 3D scenes demonstrates strict multi-view consistency, real-time rendering (>30 FPS), and visual quality superior to existing methods that rely on per-style optimization.
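The summary notes that FPGS inherits AdaIN's feed-forward stylization machinery. As a minimal sketch of what AdaIN itself computes, independent of the paper's 3D feature field (the function name, shapes, and use of NumPy are illustrative assumptions, not from the paper): per channel, content features are normalized and then rescaled and shifted to match the style features' statistics, with no per-style optimization.

```python
import numpy as np

def adain(content_feats, style_feats, eps=1e-5):
    """Adaptive Instance Normalization (AdaIN): align the per-channel
    mean/std of content features to those of the style features.

    Shapes are (N, C): N feature vectors (e.g. features queried at 3D
    points) with C channels. Purely feed-forward -- no optimization.
    """
    c_mu, c_std = content_feats.mean(0), content_feats.std(0) + eps
    s_mu, s_std = style_feats.mean(0), style_feats.std(0) + eps
    # Normalize content statistics, then re-inject style statistics.
    return (content_feats - c_mu) / c_std * s_std + s_mu
```

After the transfer, the output's per-channel mean and standard deviation match those of the style features, which is the entire "stylization" step in this family of methods.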

πŸ“ Abstract
We present FPGS, a feed-forward photorealistic style transfer method for large-scale radiance fields represented by Gaussian Splatting. FPGS stylizes large-scale 3D scenes with arbitrary, multiple style reference images without additional optimization, while preserving the multi-view consistency and real-time rendering speed of 3D Gaussians. Prior works required tedious per-style optimization or a time-consuming per-scene training stage and were limited to small-scale 3D scenes. FPGS efficiently stylizes large-scale 3D scenes by introducing a style-decomposed 3D feature field, which inherits AdaIN's feed-forward stylization machinery and supports arbitrary style reference images. Furthermore, FPGS supports multi-reference stylization with semantic correspondence matching and local AdaIN, which adds diverse user control over 3D scene styles. FPGS also preserves multi-view consistency by applying the semantic matching and style transfer processes directly to queried features in 3D space. In experiments, we demonstrate that FPGS achieves photorealistic-quality scene stylization for large-scale static and dynamic 3D scenes with diverse reference images. Project page: https://kim-geonu.github.io/FPGS/
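The abstract describes multi-reference stylization via semantic correspondence matching plus local AdaIN applied directly to queried 3D features. A hedged sketch of that idea, under assumptions of my own (nearest-neighbor cosine matching of semantic descriptors and per-region statistics; all names, shapes, and the matching rule are illustrative, not the paper's actual algorithm): each queried feature is matched to the semantically closest style region, then AdaIN is applied within each matched group.

```python
import numpy as np

def local_adain(content_feats, content_sem, style_feats_list, style_sem_list,
                eps=1e-5):
    """Illustrative local AdaIN with semantic matching.

    content_feats: (N, C) features queried at 3D points.
    content_sem:   (N, D) semantic descriptors of those points.
    style_feats_list: list of (M_i, C) feature sets, one per style region.
    style_sem_list:   list of (D,) semantic descriptors, one per region.
    """
    out = content_feats.copy()
    # Cosine similarity between each point's semantics and region descriptors.
    cn = content_sem / np.linalg.norm(content_sem, axis=1, keepdims=True)
    rn = np.stack(style_sem_list)
    rn = rn / np.linalg.norm(rn, axis=1, keepdims=True)
    match = (cn @ rn.T).argmax(1)  # (N,) best-matching region per point

    # Apply AdaIN separately within each semantically matched group.
    for r, s_feats in enumerate(style_feats_list):
        idx = match == r
        if not idx.any():
            continue
        x = content_feats[idx]
        c_mu, c_std = x.mean(0), x.std(0) + eps
        s_mu, s_std = s_feats.mean(0), s_feats.std(0) + eps
        out[idx] = (x - c_mu) / c_std * s_std + s_mu
    return out
```

Because the matching and statistics transfer operate on features in 3D space rather than on rendered images, every rendered view of a stylized point sees the same transferred feature, which is one plausible reading of how the method keeps multi-view consistency.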
Problem

Research questions and friction points this paper is trying to address.

Stylizing large-scale 3D scenes without per-style optimization
Preserving multi-view consistency and real-time rendering speed
Supporting arbitrary, multiple style references with semantic matching
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feed-forward photorealistic style transfer
Large-scale 3D scene stylization
Multi-view consistency and real-time rendering
πŸ”Ž Similar Papers
No similar papers found.