🤖 AI Summary
To address overfitting and poor surface quality in Gaussian splatting reconstruction from sparse, uncalibrated images, this paper proposes an end-to-end framework jointly optimizing dense Gaussian initialization and camera parameters. Methodologically, it leverages a large-scale Transformer to encode multi-view features, eliminating reliance on densely calibrated views. Key contributions include: (1) a self-splitting Gaussian head that generates geometrically consistent initial scenes; (2) differentiable camera parameter optimization supervised jointly by depth and multi-view feature consistency; and (3) a contribution-aware dynamic pruning strategy to mitigate overfitting. Evaluated on DTU and Replica benchmarks, the method significantly outperforms state-of-the-art approaches, achieving high-fidelity surface reconstruction and novel-view synthesis from freely captured sparse imagery in just three minutes.
📝 Abstract
Gaussian Splatting has become a leading reconstruction technique, known for its high-quality novel view synthesis and detailed reconstruction. However, most existing methods require dense, calibrated views. Reconstructing from free sparse images often yields poor surfaces due to limited overlap and overfitting. We introduce FSFSplatter, a new approach for fast surface reconstruction from free sparse images. Our method integrates end-to-end dense Gaussian initialization, camera parameter estimation, and geometry-enhanced scene optimization. Specifically, FSFSplatter employs a large Transformer to encode multi-view images and generates a dense, geometrically consistent Gaussian scene initialization via a self-splitting Gaussian head. It eliminates local floaters through contribution-based pruning and mitigates overfitting to limited views by leveraging depth and multi-view feature supervision with differentiable camera parameters during rapid optimization. FSFSplatter outperforms current state-of-the-art methods on the widely used DTU and Replica benchmarks.
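The contribution-based pruning mentioned above can be illustrated with a minimal sketch: rank Gaussians by an accumulated per-Gaussian rendering contribution and drop the lowest-scoring ones as likely floaters. The scoring quantity and the `keep_ratio` threshold here are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def prune_low_contribution(contributions, keep_ratio=0.9):
    """Keep the Gaussians with the highest accumulated rendering
    contribution; drop the rest as likely floaters.

    `contributions` is a hypothetical per-Gaussian score accumulated
    during rendering (an assumption for this sketch, not the paper's
    exact quantity). Returns a boolean keep-mask."""
    order = np.argsort(contributions)[::-1]            # highest score first
    n_keep = max(1, int(len(contributions) * keep_ratio))
    keep_mask = np.zeros(len(contributions), dtype=bool)
    keep_mask[order[:n_keep]] = True
    return keep_mask

# Toy example: 5 Gaussians, two with near-zero contribution (floaters).
scores = np.array([0.8, 0.01, 0.5, 0.002, 0.3])
mask = prune_low_contribution(scores, keep_ratio=0.6)  # keeps the top 3
```

In a real pipeline such a mask would be recomputed periodically during optimization, so Gaussians that stop contributing after camera refinement are also removed.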