MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction

📅 2025-08-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the generalization challenge of novel-view synthesis under sparse multi-baseline inputs (both small and large baselines), this paper proposes a unified feed-forward framework. It jointly leverages multi-view stereo (MVS) and monocular depth estimation (MDE) features, designs a deeply integrated projection-sampling mechanism, and constructs a geometry-aware probabilistic depth volume to guide fine-grained regression of 3D Gaussian splats. A reference-view supervision loss is introduced to explicitly enforce geometric consistency and improve training efficiency. The method achieves state-of-the-art performance on DTU and RealEstate10K, and demonstrates strong zero-shot generalization on LLFF and Mip-NeRF 360, significantly reducing both training and rendering overhead. Key innovations include cross-baseline feature co-modeling, probabilistic volume-guided 3D Gaussian regression, and a lightweight, efficient reference-view supervision paradigm.
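The geometry-aware probabilistic depth volume described above can be sketched as a softmax over per-pixel matching scores across a set of depth hypotheses, whose expectation yields a depth estimate that then guides Gaussian regression. Below is a minimal NumPy sketch of that step only; the function name and the higher-score-is-better convention are assumptions for illustration, not the paper's code.

```python
import numpy as np

def expected_depth(score_volume, depth_hypotheses):
    """Turn a per-pixel matching score volume of shape (D, H, W) into a
    depth map by a softmax-weighted expectation over D depth hypotheses.
    Convention assumed here: higher score = better multi-view match."""
    # Numerically stable softmax over the depth axis -> probability volume.
    shifted = score_volume - score_volume.max(axis=0, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=0, keepdims=True)
    # Expected depth: probability-weighted sum of the hypothesized depths.
    return np.tensordot(depth_hypotheses, probs, axes=(0, 0))
```

A sharply peaked score at one hypothesis recovers that depth; a flat volume falls back to the mean of the hypotheses.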

📝 Abstract
We present Multi-Baseline Gaussian Splatting (MuGS), a generalizable feed-forward approach for novel view synthesis that effectively handles diverse baseline settings, including sparse input views with both small and large baselines. Specifically, we integrate features from Multi-View Stereo (MVS) and Monocular Depth Estimation (MDE) to enhance feature representations for generalizable reconstruction. Next, we propose a projection-and-sampling mechanism for deep depth fusion, which constructs a fine probability volume to guide the regression of the feature map. Furthermore, we introduce a reference-view loss to improve geometry and optimization efficiency. We leverage 3D Gaussian representations to accelerate training and inference while enhancing rendering quality. MuGS achieves state-of-the-art performance across multiple baseline settings and diverse scenarios, ranging from simple objects (DTU) to complex indoor and outdoor scenes (RealEstate10K). We also demonstrate promising zero-shot performance on the LLFF and Mip-NeRF 360 datasets.
Problem

Research questions and friction points this paper is trying to address.

Generalizable novel view synthesis for diverse baseline settings
Enhancing feature representation with MVS and MDE integration
Improving geometry and efficiency via reference-view loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates MVS and MDE for enhanced feature representation
Uses projection-and-sampling for deep depth fusion
Leverages 3D Gaussian for faster training and rendering
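The reference-view loss in the list above amounts to re-rendering the input (reference) views from the predicted Gaussians and penalizing photometric error against the known input images, alongside novel-view supervision. A minimal sketch of that term, assuming a plain L1 photometric error (the function name and the absence of any weighting are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def reference_view_loss(rendered_refs, input_images):
    """Mean L1 photometric error between reference views re-rendered from
    the predicted 3D Gaussians and the corresponding input images.
    Both arguments: arrays of shape (V, H, W, 3), values in [0, 1]."""
    return float(np.abs(rendered_refs - input_images).mean())
```

Because the reference images are known inputs, this term supplies a dense, cheap supervision signal that constrains geometry without requiring extra held-out views.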
Authors

Yaopeng Lou (School of AIA, Huazhong University of Science and Technology)
Liao Shen (Huazhong University of Science and Technology)
Tianqi Liu (School of AIA, Huazhong University of Science and Technology)
Jiaqi Li (School of AIA, Huazhong University of Science and Technology)
Zihao Huang (School of AIA, Huazhong University of Science and Technology)
Huiqiang Sun (School of AIA, Huazhong University of Science and Technology)
Zhiguo Cao (Huazhong University of Science and Technology)

Pattern Recognition · Computer Vision