PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting

📅 2024-12-16
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing panoramic image synthesis methods are constrained to low resolutions (512×1024), struggling to simultaneously achieve high resolution, memory efficiency, and real-time inference. To address this, we propose the first efficient feed-forward framework for 4K spherical panoramas (2048×4096). Our method introduces a novel spherical 3D Gaussian pyramid combined with Fibonacci lattice sampling to construct a hierarchical spherical cost volume, coupled with a localized Gaussian rendering head. We further adopt a deferred backpropagation strategy enabling two-stage training on a single A100 GPU. Evaluated on both synthetic and real-world datasets, our approach achieves state-of-the-art visual quality while reducing training memory consumption by 67% and accelerating inference by 3.2×. It supports real-time rendering under wide-baseline inputs, making it suitable for applications including VR, virtual touring, and autonomous driving.

📝 Abstract
With the advent of portable 360° cameras, panorama has gained significant attention in applications like virtual reality (VR), virtual tours, robotics, and autonomous driving. As a result, wide-baseline panorama view synthesis has emerged as a vital task, where high resolution, fast inference, and memory efficiency are essential. Nevertheless, existing methods are typically constrained to lower resolutions (512 × 1024) due to demanding memory and computational requirements. In this paper, we present PanSplat, a generalizable, feed-forward approach that efficiently supports resolution up to 4K (2048 × 4096). Our approach features a tailored spherical 3D Gaussian pyramid with a Fibonacci lattice arrangement, enhancing image quality while reducing information redundancy. To accommodate the demands of high resolution, we propose a pipeline that integrates a hierarchical spherical cost volume and Gaussian heads with local operations, enabling two-step deferred backpropagation for memory-efficient training on a single A100 GPU. Experiments demonstrate that PanSplat achieves state-of-the-art results with superior efficiency and image quality across both synthetic and real-world datasets. Code is available at https://github.com/chengzhag/PanSplat.
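To illustrate why a Fibonacci lattice can reduce redundancy compared with a regular equirectangular grid, here is a minimal sketch of the standard Fibonacci lattice construction on the unit sphere. It is not taken from the PanSplat code; the function names `fibonacci_lattice` and `equirect_row_fraction`, the point count, and the 0.5 threshold are illustrative assumptions.

```python
import math


def fibonacci_lattice(n):
    """Return n approximately evenly spread unit vectors (x, y, z).

    Standard construction (an assumption here, not PanSplat's exact code):
    z is spaced uniformly in (-1, 1), which yields equal-area bands, and
    the longitude advances by the golden angle at every step.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def equirect_row_fraction(height, z_threshold=0.5):
    """Fraction of equirectangular rows whose centre has |z| > z_threshold."""
    count = 0
    for row in range(height):
        lat = math.pi * ((row + 0.5) / height - 0.5)
        if abs(math.sin(lat)) > z_threshold:
            count += 1
    return count / height


# The region |z| > 0.5 covers half of the sphere's surface area.
pts = fibonacci_lattice(1000)
lattice_fraction = sum(1 for p in pts if abs(p[2]) > 0.5) / len(pts)
grid_fraction = equirect_row_fraction(512)
print(lattice_fraction, grid_fraction)
```

In this sketch the lattice places half of its points in the half of the sphere with |z| > 0.5, while roughly two thirds of the equirectangular rows fall in that same region, which is the kind of polar oversampling a per-pixel arrangement on an equirectangular image would inherit.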
Problem

Research questions and friction points this paper is trying to address.

Synthesize high-resolution 4K panoramas efficiently
Overcome memory and computational constraints in panorama synthesis
Improve image quality with spherical 3D Gaussian pyramid
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feed-forward Gaussian splatting for 4K panoramas
Spherical 3D Gaussian pyramid with Fibonacci lattice
Hierarchical cost volume with deferred backpropagation
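
The general idea behind deferred backpropagation can be sketched with a toy example. This summary does not describe PanSplat's specific two-step procedure, so the code below only shows the generic patch-wise technique under strong assumptions: a hypothetical one-parameter "renderer" (`render`), a mean-squared-error loss, and hand-written gradients in place of an autograd framework.

```python
def render(weight, inputs):
    """Toy stand-in for a renderer: each pixel is weight * input."""
    return [weight * x for x in inputs]


def full_gradient(weight, inputs, target):
    """Reference gradient of mean((w*x - t)^2) with respect to w."""
    n = len(inputs)
    return sum(2.0 * (weight * x - t) * x for x, t in zip(inputs, target)) / n


def deferred_gradient(weight, inputs, target, patch_size):
    """Patch-wise gradient that never needs the whole image's graph at once."""
    n = len(inputs)
    # Step 1: render the full image with no gradient bookkeeping and cache
    # only the gradient of the loss with respect to each pixel.
    image = render(weight, inputs)
    pixel_grads = [2.0 * (p - t) / n for p, t in zip(image, target)]

    # Step 2: revisit the image one patch at a time and push the cached
    # pixel gradients through that patch's local Jacobian. For this toy
    # renderer, d(pixel)/d(weight) is simply the input x.
    grad_w = 0.0
    for start in range(0, n, patch_size):
        patch_inputs = inputs[start:start + patch_size]
        patch_grads = pixel_grads[start:start + patch_size]
        grad_w += sum(g * x for g, x in zip(patch_grads, patch_inputs))
    return grad_w


inputs = [0.1 * i for i in range(16)]
target = [0.5 * x for x in inputs]
print(full_gradient(2.0, inputs, target))
print(deferred_gradient(2.0, inputs, target, patch_size=4))
```

Both calls produce the same gradient up to floating-point rounding. The memory saving in a real system would come from holding intermediate activations for only one patch at a time, which is also why operations that stay local to a patch matter for this strategy.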