🤖 AI Summary
This work addresses the geometric and topological inconsistencies that arise when generating structurally coherent 360° panoramas from perspective images that lack pose information. The authors propose a Canonical Viewing Space as an intermediate representation, use a Differentiable Auto-Leveling module to align input images to it, and explicitly model the S¹ periodicity of equirectangular projection (ERP) panoramas in latent space to ensure seamless boundary continuity. The method requires no camera parameters at inference, which allows it to handle in-the-wild images, and it introduces, within a diffusion model framework, the first topology-equivariant latent representation for panoramic generation. Experiments on Horizon360, a newly curated large-scale dataset of gravity-aligned panoramas, show improved structural consistency and boundary continuity, with state-of-the-art performance in 360° scene completion.
📝 Abstract
Diffusion models excel at 2D outpainting, but extending them to $360^\circ$ panoramic completion from unposed perspective images is challenging due to the geometric and topological mismatch between perspective projections and spherical panoramas. We present Gimbal360, a principled framework that explicitly bridges perspective observations and spherical panoramas. We introduce a Canonical Viewing Space that regularizes projective geometry and provides a consistent intermediate representation between the two domains. To anchor in-the-wild inputs to this space, we propose a Differentiable Auto-Leveling module that stabilizes feature orientation without requiring camera parameters at inference. Panoramic generation also introduces a topological challenge. Standard generative architectures assume a bounded Euclidean image plane, while Equirectangular Projection (ERP) panoramas exhibit intrinsic $S^1$ periodicity. Euclidean operations therefore break boundary continuity. We address this mismatch by enforcing topological equivariance in the latent space to preserve seamless periodic structure. To support this formulation, we introduce Horizon360, a curated large-scale dataset of gravity-aligned panoramic environments. Extensive experiments show that explicitly standardizing geometric and topological priors enables Gimbal360 to achieve state-of-the-art performance in structurally consistent $360^\circ$ scene completion.
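The topological point can be illustrated with a small NumPy sketch. The abstract does not say how Gimbal360 enforces equivariance in latent space, so this is not the authors' method. It only shows the general fact that a zero-padded (Euclidean) convolution does not commute with a horizontal rotation of an ERP-like array, while a circularly padded one does. The function name `conv_rows`, the kernel, and the array sizes are assumptions made for the example.

```python
import numpy as np

def conv_rows(x, kernel, mode):
    """Filter each row of x (H, W) along the width axis with an odd-length 1D kernel.

    mode="zero" treats the image as a bounded plane (zero padding);
    mode="circular" treats the width axis as periodic (wrap-around padding).
    Hypothetical helper for illustration only.
    """
    pad = len(kernel) // 2
    np_mode = "wrap" if mode == "circular" else "constant"
    xp = np.pad(x, ((0, 0), (pad, pad)), mode=np_mode)
    out = np.zeros(x.shape, dtype=float)
    for i, w in enumerate(kernel):
        out += w * xp[:, i:i + x.shape[1]]
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))      # stand-in for an ERP latent: H=4, W=16
kernel = np.array([0.25, 0.5, 0.25])  # assumed smoothing kernel
shift = 5                             # horizontal rotation (yaw) in pixels

def equivariance_gap(mode):
    # Compare "filter then rotate" against "rotate then filter".
    a = np.roll(conv_rows(x, kernel, mode), shift, axis=1)
    b = conv_rows(np.roll(x, shift, axis=1), kernel, mode)
    return float(np.max(np.abs(a - b)))

circ_gap = equivariance_gap("circular")  # expected to be numerically zero
zero_gap = equivariance_gap("zero")      # expected to be clearly nonzero
print("circular padding gap:", circ_gap)
print("zero padding gap:", zero_gap)
```

With zero padding the mismatch is concentrated in the columns next to the left and right edges, which is where a visible seam would appear once the panorama is wrapped onto the sphere.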