MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction

📅 2025-08-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the generalization challenge of novel-view synthesis under sparse multi-baseline inputs (both small and large baselines), this paper proposes a unified feed-forward framework. It jointly leverages multi-view stereo (MVS) and monocular depth estimation (MDE) features, designs a deeply integrated projection-sampling mechanism, and constructs a geometry-aware probabilistic depth volume to guide fine-grained regression of 3D Gaussian splats. A reference-view supervision loss is introduced to explicitly enforce geometric consistency and improve training efficiency. The method achieves state-of-the-art performance on DTU and RealEstate10K, and demonstrates strong zero-shot generalization on LLFF and Mip-NeRF 360, significantly reducing both training and rendering overhead. Key innovations include cross-baseline feature co-modeling, probabilistic volume-guided 3D Gaussian regression, and a lightweight, efficient reference-view supervision paradigm.
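The geometry-aware probabilistic depth volume described above can be sketched as a softmax over per-pixel matching scores across a set of depth hypotheses, whose expectation yields a depth estimate that then guides Gaussian regression. Below is a minimal NumPy sketch of that step only; the function name and the higher-score-is-better convention are assumptions for illustration, not the paper's code.

```python
import numpy as np

def expected_depth(score_volume, depth_hypotheses):
    """Turn a per-pixel matching score volume of shape (D, H, W) into a
    depth map by a softmax-weighted expectation over D depth hypotheses.
    Convention assumed here: higher score = better multi-view match."""
    # Numerically stable softmax over the depth axis -> probability volume.
    shifted = score_volume - score_volume.max(axis=0, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=0, keepdims=True)
    # Expected depth: probability-weighted sum of the hypothesized depths.
    return np.tensordot(depth_hypotheses, probs, axes=(0, 0))
```

A sharply peaked score at one hypothesis recovers that depth; a flat volume falls back to the mean of the hypotheses.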

📝 Abstract
We present Multi-Baseline Gaussian Splatting (MuGS), a generalizable feed-forward approach for novel view synthesis that effectively handles diverse baseline settings, including sparse input views with both small and large baselines. Specifically, we integrate features from Multi-View Stereo (MVS) and Monocular Depth Estimation (MDE) to enhance feature representations for generalizable reconstruction. Next, we propose a projection-and-sampling mechanism for deep depth fusion, which constructs a fine probability volume to guide the regression of the feature map. Furthermore, we introduce a reference-view loss to improve geometry and optimization efficiency. We leverage 3D Gaussian representations to accelerate training and inference while enhancing rendering quality. MuGS achieves state-of-the-art performance across multiple baseline settings and diverse scenarios, ranging from simple objects (DTU) to complex indoor and outdoor scenes (RealEstate10K). We also demonstrate promising zero-shot performance on the LLFF and Mip-NeRF 360 datasets.
Problem

Research questions and friction points this paper is trying to address.

Generalizable novel view synthesis for diverse baseline settings
Enhancing feature representation with MVS and MDE integration
Improving geometry and efficiency via reference-view loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates MVS and MDE for enhanced feature representation
Uses projection-and-sampling for deep depth fusion
Leverages 3D Gaussian for faster training and rendering
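The reference-view loss in the list above amounts to re-rendering the input (reference) views from the predicted Gaussians and penalizing photometric error against the known input images, alongside novel-view supervision. A minimal sketch of that term, assuming a plain L1 photometric error (the function name and the absence of any weighting are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def reference_view_loss(rendered_refs, input_images):
    """Mean L1 photometric error between reference views re-rendered from
    the predicted 3D Gaussians and the corresponding input images.
    Both arguments: arrays of shape (V, H, W, 3), values in [0, 1]."""
    return float(np.abs(rendered_refs - input_images).mean())
```

Because the reference images are known inputs, this term supplies a dense, cheap supervision signal that constrains geometry without requiring extra held-out views.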
Authors

Yaopeng Lou (School of AIA, Huazhong University of Science and Technology)
Liao Shen (Huazhong University of Science and Technology)
Tianqi Liu (School of AIA, Huazhong University of Science and Technology)
Jiaqi Li (School of AIA, Huazhong University of Science and Technology)
Zihao Huang (School of AIA, Huazhong University of Science and Technology)
Huiqiang Sun (School of AIA, Huazhong University of Science and Technology)
Zhiguo Cao (Huazhong University of Science and Technology)

Pattern Recognition · Computer Vision