🤖 AI Summary
Existing wide-baseline panorama view synthesis methods are limited to low resolutions (512×1024) and struggle to combine high resolution, memory efficiency, and fast inference. To address this, we propose PanSplat, a generalizable feed-forward framework that efficiently supports spherical panoramas at up to 4K resolution (2048×4096). The method represents the scene with a spherical 3D Gaussian pyramid arranged on a Fibonacci lattice, which improves image quality while reducing information redundancy. Its pipeline pairs a hierarchical spherical cost volume with Gaussian heads that use only local operations, which enables two-step deferred backpropagation and memory-efficient training on a single A100 GPU. Evaluated on both synthetic and real-world datasets, our approach achieves state-of-the-art visual quality while reducing training memory consumption by 67% and accelerating inference by 3.2×. It supports real-time rendering under wide-baseline inputs, making it suitable for applications including VR, virtual tours, robotics, and autonomous driving.
📝 Abstract
With the advent of portable 360° cameras, panorama has gained significant attention in applications like virtual reality (VR), virtual tours, robotics, and autonomous driving. As a result, wide-baseline panorama view synthesis has emerged as a vital task, where high resolution, fast inference, and memory efficiency are essential. Nevertheless, existing methods are typically constrained to lower resolutions (512 $\times$ 1024) due to demanding memory and computational requirements. In this paper, we present PanSplat, a generalizable, feed-forward approach that efficiently supports resolution up to 4K (2048 $\times$ 4096). Our approach features a tailored spherical 3D Gaussian pyramid with a Fibonacci lattice arrangement, enhancing image quality while reducing information redundancy. To accommodate the demands of high resolution, we propose a pipeline that integrates a hierarchical spherical cost volume and Gaussian heads with local operations, enabling two-step deferred backpropagation for memory-efficient training on a single A100 GPU. Experiments demonstrate that PanSplat achieves state-of-the-art results with superior efficiency and image quality across both synthetic and real-world datasets. Code is available at https://github.com/chengzhag/PanSplat.
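To make the Fibonacci lattice arrangement mentioned above more concrete, the following is a minimal sketch of the standard golden-angle construction for placing points roughly uniformly on a unit sphere. An equirectangular pixel grid, by contrast, samples the sphere far more densely near the poles than at the equator. The function names `fibonacci_lattice` and `to_equirect_pixel` are hypothetical, chosen for illustration; this is not taken from the PanSplat code base, and the paper's exact arrangement (for example, how lattices are sized per pyramid level) may differ.

```python
import math


def fibonacci_lattice(n):
    """Return n roughly uniformly spaced unit vectors (x, y, z) on a sphere.

    Standard golden-angle construction; illustrative only, not PanSplat's code.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # Space z evenly in (-1, 1) so every point covers an equal-area band.
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        # Rotate each successive point by the golden angle around the z axis.
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def to_equirect_pixel(point, height, width):
    """Map a unit vector to continuous (u, v) coordinates of an equirectangular image.

    Assumes z is the up axis and v = 0 is the top row (an assumed convention).
    """
    x, y, z = point
    lon = math.atan2(y, x)                    # longitude in [-pi, pi]
    lat = math.asin(max(-1.0, min(1.0, z)))   # latitude in [-pi/2, pi/2]
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v


if __name__ == "__main__":
    pts = fibonacci_lattice(8)
    for p in pts:
        print(to_equirect_pixel(p, 2048, 4096))
```

Because each lattice point covers approximately the same solid angle, a set of points like this avoids spending a large share of the samples on the polar rows of a 2048 $\times$ 4096 panorama, which is one way to read the abstract's claim of reduced information redundancy.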