🤖 AI Summary
This work addresses the challenging task of synthesizing geometrically consistent and photorealistic 360° indoor panoramas from 2D top-down views. We propose Top2Pano, a two-stage end-to-end framework: Stage I estimates volumetric occupancy and applies volume rendering to generate structurally accurate coarse color and depth panoramas; Stage II refines these with a ControlNet-guided diffusion model, improving realism and structural fidelity. To our knowledge, this is the first approach to deeply integrate volumetric reasoning with generative diffusion modeling for structure-aware panoramic synthesis. On two datasets, our method outperforms baselines in reconstructing geometry, occlusions, and spatial arrangements. It also generalizes to schematic floor plans, producing high-quality panoramas from them.
📝 Abstract
Generating immersive 360° indoor panoramas from 2D top-down views has applications in virtual reality, interior design, real estate, and robotics. This task is challenging due to the lack of explicit 3D structure and the need for geometric consistency and photorealism. We propose Top2Pano, an end-to-end model for synthesizing realistic indoor panoramas from top-down views. Our method estimates volumetric occupancy to infer 3D structures, then uses volumetric rendering to generate coarse color and depth panoramas. These guide a diffusion-based refinement stage using ControlNet, enhancing realism and structural fidelity. Evaluations on two datasets show Top2Pano outperforms baselines, effectively reconstructing geometry, occlusions, and spatial arrangements. It also generalizes well, producing high-quality panoramas from schematic floorplans. Our results highlight Top2Pano's potential in bridging top-down views with immersive indoor synthesis.
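The coarse stage described above (occupancy, then volumetric rendering into color and depth panoramas) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a density grid and a per-voxel color grid are already available (in Top2Pano these would be inferred from the top-down view), uses nearest-neighbor voxel lookup, and applies standard alpha compositing along equirectangular rays. The function names `panorama_rays` and `render_panorama`, the grid sizes, and the sampling parameters are all hypothetical.

```python
import numpy as np

def panorama_rays(height, width):
    """Unit ray directions for an equirectangular panorama, shape (H, W, 3)."""
    lon = (np.arange(width) + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(height) + 0.5) / height * np.pi
    lon, lat = np.meshgrid(lon, lat)  # each of shape (H, W)
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

def render_panorama(density, color, origin, height=32, width=64, n_samples=64):
    """Alpha-composite a voxel grid into coarse color and depth panoramas.

    density: (X, Y, Z) nonnegative volume density (assumed given).
    color:   (X, Y, Z, 3) per-voxel RGB in [0, 1] (assumed given).
    origin:  camera position in voxel coordinates (voxels have unit size).
    """
    dims = np.array(density.shape)
    origin = np.asarray(origin, dtype=float)
    t_far = float(np.linalg.norm(dims))          # grid diagonal bounds every ray
    delta = t_far / n_samples                    # uniform step length
    t = (np.arange(n_samples) + 0.5) * delta     # sample distances, shape (S,)

    dirs = panorama_rays(height, width)
    pts = origin + dirs[..., None, :] * t[:, None]        # (H, W, S, 3)
    idx = np.floor(pts).astype(int)
    inside = np.all((idx >= 0) & (idx < dims), axis=-1)   # samples within the grid
    idx = np.clip(idx, 0, dims - 1)

    # Nearest-neighbor lookup; samples outside the grid contribute no density.
    sigma = np.where(inside, density[idx[..., 0], idx[..., 1], idx[..., 2]], 0.0)
    rgb = color[idx[..., 0], idx[..., 1], idx[..., 2]]    # (H, W, S, 3)

    alpha = 1.0 - np.exp(-sigma * delta)                  # per-sample opacity
    trans = np.cumprod(1.0 - alpha, axis=-1)              # transmittance after each sample
    trans = np.concatenate([np.ones_like(trans[..., :1]), trans[..., :-1]], axis=-1)
    weights = trans * alpha                               # compositing weights

    coarse_rgb = (weights[..., None] * rgb).sum(axis=-2)  # (H, W, 3)
    coarse_depth = (weights * t).sum(axis=-1)             # (H, W)
    opacity = weights.sum(axis=-1)                        # (H, W)
    return coarse_rgb, coarse_depth, opacity

# Toy scene: an empty 16 x 16 x 8 room enclosed by dense one-voxel walls.
density = np.zeros((16, 16, 8))
density[0], density[-1] = 50.0, 50.0
density[:, 0], density[:, -1] = 50.0, 50.0
density[:, :, 0], density[:, :, -1] = 50.0, 50.0
color = np.random.default_rng(0).random((16, 16, 8, 3))
coarse_rgb, coarse_depth, opacity = render_panorama(density, color, origin=(8.0, 8.0, 4.0))
```

In a pipeline like the one the abstract describes, `coarse_rgb` and `coarse_depth` would play the role of the conditioning images passed to the diffusion-based refinement stage; that stage is not sketched here.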