🤖 AI Summary
Existing feed-forward 3D Gaussian Splatting (3DGS) models are designed exclusively for perspective images and fail to model the distortion and spherical geometry inherent in omnidirectional imagery, leading to inaccurate feature encoding and non-uniform Gaussian spatial distributions that ultimately degrade novel-view synthesis quality. This paper introduces OmniSplat, a training-free, fast feed-forward 3DGS framework tailored for omnidirectional images. It decomposes each panorama with a Yin-Yang grid, whose quasi-uniform sampling makes the decomposed images resemble perspective views, thereby bridging spherical geometry with the planar inputs the networks expect. Because the decomposed images remain compatible with standard CNN encoders, the method reuses a pre-trained feed-forward network without fine-tuning and exploits its learned perspective priors. Evaluated on omnidirectional images, OmniSplat demonstrates higher reconstruction accuracy than existing feed-forward networks trained on perspective images.
📝 Abstract
Feed-forward 3D Gaussian splatting (3DGS) models have gained significant popularity due to their ability to generate scenes immediately without per-scene optimization. Omnidirectional images are also becoming more popular, since they capture a holistic scene without the computation required to stitch multiple perspective images; however, existing feed-forward models are designed only for perspective images. The unique optical properties of omnidirectional images make it difficult for feature encoders to correctly understand the image context and make the Gaussians non-uniform in space, which degrades the quality of images synthesized from novel views. We propose OmniSplat, a training-free fast feed-forward 3DGS generation framework for omnidirectional images. We adopt a Yin-Yang grid and decompose images based on it to reduce the domain gap between omnidirectional and perspective images. The Yin-Yang grid is compatible with existing CNN structures as-is, and its quasi-uniform characteristic makes each decomposed image resemble a perspective image, so the method can exploit the strong prior knowledge of the learned feed-forward network. OmniSplat demonstrates higher reconstruction accuracy than existing feed-forward networks trained on perspective images. Our project page is available at: https://robot0321.github.io/omnisplat/index.html.
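To make the Yin-Yang decomposition concrete, below is a minimal sketch (not from the paper) of how a Yin-Yang grid can be built and used to resample an equirectangular panorama into two quasi-uniform patches. The Yin patch is a standard latitude-longitude grid restricted to latitudes within ±45° and longitudes within ±135°, and the Yang patch is the same grid under the classic rotation (x, y, z) → (−x, z, y), so the two patches together cover the sphere. The function names, grid resolutions, and nearest-neighbour sampling are illustrative assumptions, not OmniSplat's actual implementation.

```python
import numpy as np

def yin_yang_grids(H, W):
    """Build (lat, lon) sample grids for the Yin and Yang patches.

    Yin covers lat in [-45 deg, 45 deg], lon in [-135 deg, 135 deg];
    Yang is the same patch rotated so the pair tiles the sphere
    quasi-uniformly. Angles are in radians. H and W are hypothetical
    per-patch grid resolutions, not parameters from the paper.
    """
    lat = np.linspace(-np.pi / 4, np.pi / 4, H)
    lon = np.linspace(-3 * np.pi / 4, 3 * np.pi / 4, W)
    lon_g, lat_g = np.meshgrid(lon, lat)

    # Yin patch: unit direction vectors from spherical coordinates.
    x = np.cos(lat_g) * np.cos(lon_g)
    y = np.cos(lat_g) * np.sin(lon_g)
    z = np.sin(lat_g)

    # Yang patch: rotate Yin with the standard (x, y, z) -> (-x, z, y) map.
    xr, yr, zr = -x, z, y

    def to_latlon(px, py, pz):
        return np.arcsin(np.clip(pz, -1.0, 1.0)), np.arctan2(py, px)

    return to_latlon(x, y, z), to_latlon(xr, yr, zr)

def sample_equirect(pano, lat, lon):
    """Nearest-neighbour lookup into an (h, w, c) equirectangular image."""
    h, w = pano.shape[:2]
    u = ((lon + np.pi) / (2 * np.pi) * w).astype(int) % w          # longitude -> column
    v = ((np.pi / 2 - lat) / np.pi * h).astype(int).clip(0, h - 1)  # latitude -> row
    return pano[v, u]

# Usage: decompose a panorama into the two patches that a perspective
# feed-forward encoder could then process.
pano = np.random.rand(256, 512, 3)
(yin_lat, yin_lon), (yang_lat, yang_lon) = yin_yang_grids(128, 384)
yin_patch = sample_equirect(pano, yin_lat, yin_lon)
yang_patch = sample_equirect(pano, yang_lat, yang_lon)
```

Because each patch avoids the extreme pole-region stretching of the full equirectangular projection, its local pixel geometry stays close to that of a perspective image, which is the property the abstract relies on when reusing a perspective-trained encoder.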