AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views

📅 2025-05-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses feed-forward novel view synthesis from multi-view image collections without prior camera pose information. We propose the first end-to-end, zero-pose-supervision framework that jointly estimates scene geometry, appearance, and per-image intrinsic and extrinsic camera parameters in a single forward pass, producing a 3D Gaussian splatting field suitable for real-time rendering. Our method employs a deep neural network to directly map input images to Gaussian parameters and camera parameters, incorporating an implicit camera calibration module that eliminates conventional iterative optimization. In zero-shot evaluation, our approach matches the performance of pose-aware methods and significantly outperforms existing pose-free approaches under both sparse and dense view settings. Rendering latency is reduced by an order of magnitude, enabling near-real-time novel view synthesis for the first time under completely unconstrained capture conditions.
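The single-forward-pass interface described above can be sketched as follows. This is a purely illustrative sketch of the input/output contract only: the function name `anysplat_forward`, the one-Gaussian-per-pixel layout, and the dummy placeholder math are assumptions for illustration, not the paper's actual network or architecture.

```python
import numpy as np

def anysplat_forward(images):
    """Illustrative interface: map N uncalibrated images to a set of
    3D Gaussian primitives plus per-image intrinsics and extrinsics.
    Dummy constants stand in for the learned network's predictions."""
    n, h, w, _ = images.shape
    num_gaussians = n * h * w  # e.g. one Gaussian per input pixel
    gaussians = {
        "means": np.zeros((num_gaussians, 3)),                      # 3D centers
        "scales": np.ones((num_gaussians, 3)),                      # anisotropic extents
        "rotations": np.tile([1.0, 0.0, 0.0, 0.0], (num_gaussians, 1)),  # unit quaternions
        "opacities": np.full((num_gaussians, 1), 0.5),
        "colors": images.reshape(num_gaussians, 3),                 # per-Gaussian RGB
    }
    cameras = {
        "intrinsics": np.tile(np.eye(3), (n, 1, 1)),   # one K matrix per image
        "extrinsics": np.tile(np.eye(4), (n, 1, 1)),   # one world-to-camera pose per image
    }
    return gaussians, cameras

imgs = np.random.rand(4, 8, 8, 3)  # four uncalibrated views
g, cams = anysplat_forward(imgs)
print(g["means"].shape, cams["intrinsics"].shape)  # (256, 3) (4, 3, 3)
```

The point is the shape of the contract: cameras come out of the same pass as the Gaussians, so no separate calibration or per-scene optimization step is needed before rendering.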

📝 Abstract
We introduce AnySplat, a feed-forward network for novel view synthesis from uncalibrated image collections. In contrast to traditional neural rendering pipelines that demand known camera poses and per-scene optimization, or recent feed-forward methods that buckle under the computational weight of dense views, our model predicts everything in one shot. A single forward pass yields a set of 3D Gaussian primitives encoding both scene geometry and appearance, and the corresponding camera intrinsics and extrinsics for each input image. This unified design scales effortlessly to casually captured, multi-view datasets without any pose annotations. In extensive zero-shot evaluations, AnySplat matches the quality of pose-aware baselines in both sparse- and dense-view scenarios while surpassing existing pose-free approaches. Moreover, it greatly reduces rendering latency compared to optimization-based neural fields, bringing real-time novel view synthesis within reach for unconstrained capture settings. Project page: https://city-super.github.io/anysplat/
Problem

Research questions and friction points this paper is trying to address.

Novel view synthesis from uncalibrated image collections
Predicting 3D Gaussian primitives and camera poses in one forward pass
Real-time rendering for unconstrained capture settings without pose annotations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feed-forward network for novel view synthesis
Predicts 3D Gaussian primitives in one shot
Unified design scales without pose annotations
Lihan Jiang
USTC, Shanghai AI Laboratory
neural rendering, 3D reconstruction
Yucheng Mao
UC San Diego
3D Computer Vision
Linning Xu
The Chinese University of Hong Kong
Tao Lu
Brown University
Kerui Ren
Shanghai Jiao Tong University, Shanghai AI Laboratory
3D Reconstruction, Neural Rendering
Yichen Jin
Shanghai Artificial Intelligence Laboratory
Xudong Xu
Shanghai Artificial Intelligence Laboratory
Mulin Yu
Shanghai AI Laboratory; INRIA
3D reconstruction and 3D repairing
Jiangmiao Pang
Shanghai Artificial Intelligence Laboratory
Feng Zhao
The University of Science and Technology of China
Dahua Lin
The Chinese University of Hong Kong
computer vision, machine learning, probabilistic inference, Bayesian nonparametrics
Bo Dai
The University of Hong Kong