SparSplat: Fast Multi-View Reconstruction with Generalizable 2D Gaussian Splatting

📅 2025-05-04
📈 Citations: 1
Influential: 0
🤖 AI Summary
Addressing the challenge of real-time multi-view stereo (MVS) reconstruction and novel view synthesis (NVS) under sparse-view settings, this paper introduces the first end-to-end feedforward 2D Gaussian splatting framework. The method directly regresses generalizable 2D Gaussian parameters, jointly optimizing geometric reconstruction accuracy and rendering quality. To enhance precision, speed, and cross-dataset generalization, it incorporates multi-view feature distillation and explicit MVS supervision, integrating pre-trained visual features. Experimental results demonstrate state-of-the-art performance on DTU (Chamfer distance), significant improvements over existing methods on BlendedMVS and Tanks and Temples, and inference speed approximately 100× faster than implicit volumetric rendering. The core contribution lies in the first differentiable, generalizable, and real-time feedforward 2D Gaussian splatting reconstruction pipeline tailored for sparse-view scenarios.

📝 Abstract
Recovering 3D information from scenes via multi-view stereo reconstruction (MVS) and novel view synthesis (NVS) is inherently challenging, particularly in scenarios involving sparse-view setups. The advent of 3D Gaussian Splatting (3DGS) enabled real-time, photorealistic NVS. Following this, 2D Gaussian Splatting (2DGS) leveraged perspective-accurate 2D Gaussian primitive rasterization to achieve accurate geometry representation during rendering, improving 3D scene reconstruction while maintaining real-time performance. Recent approaches have tackled the problem of sparse real-time NVS using 3DGS within a generalizable, MVS-based learning framework to regress 3D Gaussian parameters. Our work extends this line of research by addressing the challenge of generalizable sparse 3D reconstruction and NVS jointly, and performs successfully at both tasks. We propose an MVS-based learning pipeline that regresses 2DGS surface element parameters in a feed-forward fashion to perform 3D shape reconstruction and NVS from sparse-view images. We further show that our generalizable pipeline can benefit from preexisting foundational multi-view deep visual features. The resulting model attains state-of-the-art results on the DTU sparse 3D reconstruction benchmark in terms of Chamfer distance to ground truth, as well as state-of-the-art NVS quality. It also demonstrates strong generalization on the BlendedMVS and Tanks and Temples datasets. We note that our model outperforms the prior state of the art in feed-forward sparse-view reconstruction based on volume rendering of implicit representations, while offering an almost two orders of magnitude higher inference speed.
Problem

Research questions and friction points this paper is trying to address.

Generalizable sparse 3D reconstruction and novel view synthesis
Fast multi-view reconstruction using 2D Gaussian splatting
Real-time performance with accurate geometry representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses 2D Gaussian Splatting for accurate geometry
MVS-based pipeline regresses 2DGS parameters feed-forward
Leverages foundational multi-view deep visual features
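The feed-forward regression idea above can be sketched in code. This is a hypothetical illustration only, not the paper's implementation: the function name, the random linear "decoder" (standing in for the learned MLP head), and the parameter layout are all assumptions. It shows the general shape of mapping per-pixel MVS features and predicted depths directly to 2D Gaussian (surfel) parameters in one pass, with no per-scene optimization:

```python
import numpy as np

def regress_2dgs_params(features, pixels, depths, K_inv, cam_to_world, seed=0):
    """Hypothetical feed-forward head: per-pixel features -> 2DGS surfel params.

    features     : (N, F) per-pixel features, e.g. from an MVS cost volume
    pixels       : (N, 2) pixel coordinates
    depths       : (N,)   predicted depth along each pixel ray
    K_inv        : (3, 3) inverse camera intrinsics
    cam_to_world : (4, 4) camera-to-world pose
    """
    N, F = features.shape
    rng = np.random.default_rng(seed)

    # Unproject each pixel to a 3D surfel center along its camera ray.
    rays = (K_inv @ np.c_[pixels, np.ones(N)].T).T            # (N, 3) camera space
    centers_cam = rays * depths[:, None]
    centers = (np.c_[centers_cam, np.ones(N)] @ cam_to_world.T)[:, :3]

    # Random linear layer as a stand-in for the learned decoder MLP.
    raw = features @ (0.1 * rng.standard_normal((F, 10)))
    scales = np.log1p(np.exp(raw[:, 0:2]))                    # softplus: 2 in-plane scales
    opacity = 1.0 / (1.0 + np.exp(-raw[:, 2]))                # sigmoid, in (0, 1)
    color = 1.0 / (1.0 + np.exp(-raw[:, 3:6]))                # sigmoid RGB
    quat = raw[:, 6:10]
    quat = quat / (np.linalg.norm(quat, axis=1, keepdims=True) + 1e-8)  # unit rotation

    return {"centers": centers, "scales": scales,
            "opacity": opacity, "color": color, "rotation": quat}
```

Because every output is produced by a single network pass, inference cost is one forward evaluation plus rasterization, which is where the speed advantage over per-ray volume rendering of implicit representations comes from.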
Shubhendu Jena
Inria, Univ. Rennes, CNRS, IRISA
Shishir Reddy Vutukur
TU Munich, Siemens AG
Object Pose Estimation · 6D Pose · Object Pose Distribution
A. Boukhayma
Inria, Univ. Rennes, CNRS, IRISA