🤖 AI Summary
Existing style transfer methods for dynamic scenes, such as videos and interactive games, struggle with spatiotemporal inconsistency and rely on per-frame input and per-style optimization. To address this, we propose the first zero-shot stylization framework tailored for 3D dynamic scenes. Our core innovation is to apply style transfer directly to the learnable feature vectors attached to the Gaussians of a 3D Gaussian Splatting representation, rather than to rendered 2D images, and to integrate implicit temporal modeling so that re-rendered output stays style-consistent across frames and viewpoints. Given only a single style reference image, our method generalizes to unseen styles without fine-tuning. Evaluated on real-world dynamic scenes, it outperforms state-of-the-art approaches in both style fidelity and inter-frame coherence. The framework is computationally efficient and suitable for real-time applications including gaming, film production, and extended reality (XR).
📝 Abstract
Stylizing a dynamic scene based on an exemplar image is critical for various real-world applications, including gaming, filmmaking, and augmented and virtual reality. However, achieving consistent stylization across both spatial and temporal dimensions remains a significant challenge. Most existing methods are designed for static scenes and often require an optimization process for each style image, limiting their adaptability. We introduce ZDySS, a zero-shot stylization framework for dynamic scenes, allowing our model to generalize to previously unseen style images at inference. Our approach employs Gaussian splatting for scene representation, linking each Gaussian to a learned feature vector that renders a feature map for any given view and timestamp. By applying style transfer on the learned feature vectors instead of the rendered feature map, we enhance spatio-temporal consistency across frames. Our method demonstrates superior performance and coherence over state-of-the-art baselines in tests on real-world dynamic scenes, making it a robust solution for practical applications.
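To make the distinction between stylizing per-Gaussian feature vectors and stylizing a rendered feature map concrete, the following is a minimal sketch rather than the ZDySS implementation. It assumes adaptive instance normalization (AdaIN) as the style transfer operator, which the abstract does not specify, and it replaces the Gaussian splatting rasterizer with a simple weighted blend. The function names, the toy feature values, and the blending weights are all hypothetical.

```python
import math


def channel_stats(features, eps=1e-5):
    """Per-channel mean and standard deviation over a list of feature vectors."""
    n = len(features)
    dims = len(features[0])
    means = [sum(f[c] for f in features) / n for c in range(dims)]
    stds = [
        math.sqrt(sum((f[c] - means[c]) ** 2 for f in features) / n + eps)
        for c in range(dims)
    ]
    return means, stds


def adain(content, style, eps=1e-5):
    """Shift and scale each content channel to match the style channel statistics.

    AdaIN is an assumption here; the source only says "style transfer".
    """
    c_mean, c_std = channel_stats(content, eps)
    s_mean, s_std = channel_stats(style, eps)
    return [
        [s_std[c] * (f[c] - c_mean[c]) / c_std[c] + s_mean[c] for c in range(len(f))]
        for f in content
    ]


def render_pixel(weights, features):
    """Blend per-Gaussian features into one pixel feature.

    A stand-in for alpha compositing: the weights would come from the
    rasterizer for a given view and timestamp.
    """
    dims = len(features[0])
    return [sum(w * f[c] for w, f in zip(weights, features)) for c in range(dims)]


# Toy scene: three Gaussians, each carrying a 2-channel learned feature vector.
gaussian_features = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
# Toy style: features extracted from a single style image (hypothetical values).
style_features = [[10.0, -1.0], [14.0, 1.0]]

# Stylize once, in the space of per-Gaussian features.
stylized = adain(gaussian_features, style_features)

# Two views (or timestamps) blend the same stylized features with different weights.
view_a = render_pixel([0.7, 0.2, 0.1], stylized)
view_b = render_pixel([0.1, 0.2, 0.7], stylized)
```

Because `stylized` is computed once and shared by every render, the style statistics do not depend on which Gaussians happen to be visible in a given view or frame. Applying the same operator to each rendered feature map instead would recompute the content statistics per view, which is one plausible source of the flicker that the feature-space formulation is meant to avoid.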