No Pose at All: Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views

📅 2025-08-01
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This work addresses 3D Gaussian splatting reconstruction from sparse multi-view images without ground-truth camera-pose supervision. We propose SPFSplat, a unified framework that jointly predicts canonical-space 3D Gaussian primitives and camera poses in a single forward pass through a shared feature extraction backbone. To our knowledge, this is the first end-to-end self-supervised Gaussian splatting method that operates without pose priors. A differentiable reprojection loss, optimized jointly with the standard rendering losses, enforces geometric consistency. Requiring no pose priors, depth, or surface-normal constraints, the approach reconstructs robustly under large viewpoint variations. SPFSplat achieves state-of-the-art novel-view synthesis performance on multiple benchmarks and superior accuracy in relative pose estimation, showing that it implicitly learns reliable geometric structure from appearance alone.

📝 Abstract
We introduce SPFSplat, an efficient framework for 3D Gaussian splatting from sparse multi-view images, requiring no ground-truth poses during training or inference. It employs a shared feature extraction backbone, enabling simultaneous prediction of 3D Gaussian primitives and camera poses in a canonical space from unposed inputs within a single feed-forward step. Alongside the rendering loss based on estimated novel-view poses, a reprojection loss is integrated to enforce the learning of pixel-aligned Gaussian primitives for enhanced geometric constraints. This pose-free training paradigm and efficient one-step feed-forward design make SPFSplat well-suited for practical applications. Remarkably, despite the absence of pose supervision, SPFSplat achieves state-of-the-art performance in novel view synthesis even under significant viewpoint changes and limited image overlap. It also surpasses recent methods trained with geometry priors in relative pose estimation. Code and trained models are available on our project page: https://ranrhuang.github.io/spfsplat/.
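The reprojection loss mentioned above constrains each pixel-aligned Gaussian center to project back onto the pixel it was predicted from, under the estimated camera pose. A minimal NumPy sketch of such a loss, assuming a pinhole camera with intrinsics `K` and a world-to-camera pose `(R, t)`; names and shapes are illustrative, not the paper's implementation:

```python
import numpy as np

def reproject(points_3d, R, t, K):
    """Project canonical-space 3D points through a world-to-camera pose (R, t) and intrinsics K."""
    cam = points_3d @ R.T + t          # world -> camera coordinates
    uv = cam @ K.T                     # apply pinhole intrinsics
    return uv[:, :2] / uv[:, 2:3]      # perspective divide -> pixel coordinates

def reprojection_loss(points_3d, pixels, R, t, K):
    """Mean squared pixel error between reprojected points and their source pixels."""
    err = reproject(points_3d, R, t, K) - pixels
    return float(np.mean(np.sum(err ** 2, axis=1)))
```

In the paper this term is differentiable and jointly optimized with the rendering loss; the sketch only illustrates the geometric consistency being enforced.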
Problem

Research questions and friction points this paper is trying to address.

3D Gaussian splatting from sparse multi-view images without ground-truth poses
Simultaneous prediction of 3D Gaussian primitives and camera poses in canonical space
Achieving state-of-the-art novel view synthesis without pose supervision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised 3D Gaussian splatting without poses
Single feed-forward step predicts primitives and poses
Reprojection loss enhances geometric constraints
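The single feed-forward design above amounts to one shared feature extractor feeding two lightweight heads, one for Gaussian parameters and one for camera poses. A toy NumPy sketch of that interface, with random linear layers standing in for the learned backbone and heads (all names, shapes, and parameterizations here are hypothetical, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for learned weights.
W_feat = rng.standard_normal((3, 16))    # pooled RGB -> shared features
W_gauss = rng.standard_normal((16, 3))   # features -> canonical-space Gaussian centers
W_pose = rng.standard_normal((16, 7))    # features -> quaternion (4) + translation (3)

def shared_backbone(images):
    """Stand-in shared-weight extractor: one feature vector per input view."""
    pooled = images.mean(axis=(2, 3))    # (V, 3) global average pooling
    return pooled @ W_feat               # (V, 16)

def joint_forward(images):
    """One feed-forward pass yields both Gaussian centers and camera poses."""
    feats = shared_backbone(images)      # the same features feed both heads
    centers = feats @ W_gauss            # (V, 3) Gaussian centers
    raw = feats @ W_pose                 # (V, 7) raw pose parameters
    quats = raw[:, :4]
    quats = quats / np.linalg.norm(quats, axis=1, keepdims=True)  # unit rotations
    return centers, quats, raw[:, 4:]    # centers, rotations, translations
```

Because both heads read the same features, pose estimation and reconstruction share supervision: the rendering and reprojection losses backpropagate through one backbone.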