SplatFormer: Point Transformer for Robust 3D Gaussian Splatting

📅 2024-11-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the severe degradation in rendering quality of 3D Gaussian Splatting (3DGS) when synthesizing out-of-distribution (OOD) novel views, i.e., viewpoints far from the training poses. The authors propose SplatFormer, the first point transformer that operates directly on the Gaussian point set: it takes a raw 3DGS reconstruction optimized under limited training views as input and refines the point set in a single feed-forward pass, removing artifacts that would otherwise appear in OOD test views. The core contribution is the integration of a point transformer architecture into 3DGS representation learning, explicitly modeling geometric and appearance dependencies among Gaussians to improve OOD generalization, and overcoming the restriction of prior multi-scene methods to a limited number of input views. Experiments show that the method substantially suppresses artifacts under extreme novel viewpoints and consistently outperforms leading 3DGS regularization techniques, multi-scene models tailored for sparse-view synthesis, and diffusion-based frameworks, achieving state-of-the-art performance on OOD novel view synthesis.

📝 Abstract
3D Gaussian Splatting (3DGS) has recently transformed photorealistic reconstruction, achieving high visual fidelity and real-time performance. However, rendering quality significantly deteriorates when test views deviate from the camera angles used during training, posing a major challenge for applications in immersive free-viewpoint rendering and navigation. In this work, we conduct a comprehensive evaluation of 3DGS and related novel view synthesis methods under out-of-distribution (OOD) test camera scenarios. By creating diverse test cases with synthetic and real-world datasets, we demonstrate that most existing methods, including those incorporating various regularization techniques and data-driven priors, struggle to generalize effectively to OOD views. To address this limitation, we introduce SplatFormer, the first point transformer model specifically designed to operate on Gaussian splats. SplatFormer takes as input an initial 3DGS set optimized under limited training views and refines it in a single forward pass, effectively removing potential artifacts in OOD test views. To our knowledge, this is the first successful application of point transformers directly on 3DGS sets, surpassing the limitations of previous multi-scene training methods, which could handle only a restricted number of input views during inference. Our model significantly improves rendering quality under extreme novel views, achieving state-of-the-art performance in these challenging scenarios and outperforming various 3DGS regularization techniques, multi-scene models tailored for sparse view synthesis, and diffusion-based frameworks.
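The refinement step described above (take an optimized 3DGS set, run one feed-forward pass of a point transformer, output a corrected set) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the attribute layout (14 values per Gaussian: mean, scale, rotation quaternion, opacity, color), the hidden width, and the random projection matrices are all illustrative assumptions; the real model uses trained weights and a full point transformer backbone.

```python
import numpy as np

# Sketch of SplatFormer-style refinement: each Gaussian is a point with
# attributes; a single self-attention pass over the set predicts residual
# corrections. All shapes and weights here are illustrative assumptions.

rng = np.random.default_rng(0)

N, D = 128, 14   # N Gaussians; 3 mean + 3 scale + 4 quat + 1 opacity + 3 color
H = 32           # hidden width (assumed)

gaussians = rng.normal(size=(N, D))   # stand-in for a raw 3DGS reconstruction

# Toy "learned" projections (random here; trained in the real model).
Wq, Wk, Wv = (rng.normal(size=(D, H)) / np.sqrt(D) for _ in range(3))
Wo = rng.normal(size=(H, D)) / np.sqrt(H)

def refine(x: np.ndarray) -> np.ndarray:
    """One self-attention pass over the Gaussian set -> residual update."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(H)                       # pairwise attention logits
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    att = scores / scores.sum(axis=1, keepdims=True)    # row-wise softmax
    residual = (att @ v) @ Wo                           # per-Gaussian correction
    return x + residual                                 # refined attributes

refined = refine(gaussians)
print(refined.shape)   # same set size and attribute layout, corrected values
```

The key property this sketch captures is that refinement is set-to-set and single-pass: the number of Gaussians and their attribute layout are unchanged, so the refined set can be rendered with the standard 3DGS rasterizer.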
Problem

Research questions and friction points this paper is trying to address.

3DGS rendering quality degrades sharply when test views deviate from the training camera distribution.
Existing novel view synthesis methods, including regularization- and prior-based ones, generalize poorly to OOD views.
Prior multi-scene feed-forward models can handle only a restricted number of input views at inference.
Innovation

Methods, ideas, or system contributions that make the work stand out.

SplatFormer: point transformer for 3D Gaussian splats
Refines 3DGS sets in single forward pass
Improves rendering quality for extreme novel views