🤖 AI Summary
This work addresses the challenge of rapidly generating geometrically consistent, high-fidelity 3D scenes from a single panoramic image for robotic simulation. The authors propose a feed-forward Gaussian-splatting approach that decomposes the input panorama into six cube-map faces and processes them in parallel. By incorporating a training-free monocular depth-injection mechanism and a depth-aware fusion strategy, the method achieves cross-view geometric consistency without requiring multi-view inputs or iterative optimization. The resulting pipeline synthesizes photorealistic 3D environments in seconds and has been integrated into the Genie Sim platform, offering an efficient and scalable simulation framework for embodied-intelligence tasks.
📝 Abstract
We present Genie Sim PanoRecon, a feed-forward Gaussian-splatting pipeline that delivers high-fidelity, low-cost 3D scenes for robotic manipulation simulation. The input panorama is decomposed into six non-overlapping cube-map faces, which are processed in parallel and seamlessly reassembled. To guarantee geometric consistency across views, we devise a depth-aware fusion strategy coupled with a training-free depth-injection module that steers the monocular feed-forward network to generate coherent 3D Gaussians. The whole system reconstructs photo-realistic scenes in seconds and has been integrated into Genie Sim, an LLM-driven simulation platform for embodied synthetic data generation and evaluation, to provide scalable backgrounds for manipulation tasks. The code is available at: https://github.com/AgibotTech/genie_sim/tree/main/source/geniesim_world.
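To make the cube-map decomposition step concrete, the following is a minimal sketch of how an equirectangular panorama can be split into six 90-degree faces that tile the sphere without overlap. It is not taken from the Genie Sim code: the axis convention, the face names, and the helpers `direction_to_equirect` and `face_sample_grid` are assumptions chosen for illustration, and the abstract does not state which layout, face resolution, or interpolation the authors use.

```python
import math

# Illustrative face layout (an assumption, not taken from the paper):
# x points right, y up, z forward; (a, b) are face coordinates in [-1, 1],
# with a increasing to the right and b increasing upward on each face.
FACES = {
    "front": lambda a, b: (a, b, 1.0),
    "right": lambda a, b: (1.0, b, -a),
    "back":  lambda a, b: (-a, b, -1.0),
    "left":  lambda a, b: (-1.0, b, a),
    "up":    lambda a, b: (a, 1.0, -b),
    "down":  lambda a, b: (a, -1.0, b),
}


def direction_to_equirect(x, y, z, width, height):
    """Map a 3D viewing direction to continuous (col, row) panorama coordinates."""
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)  # longitude in [-pi, pi], 0 at the front
    # Latitude in [-pi/2, pi/2], 0 at the horizon; clamp guards against rounding.
    lat = math.asin(max(-1.0, min(1.0, y / norm)))
    col = ((lon / (2.0 * math.pi) + 0.5) * width) % width  # wrap horizontally
    row = (0.5 - lat / math.pi) * height
    return col, row


def face_sample_grid(face, size, width, height):
    """Return a size x size grid of panorama (col, row) lookups for one cube face."""
    to_dir = FACES[face]
    grid = []
    for i in range(size):  # face rows, top to bottom
        b = 1.0 - 2.0 * (i + 0.5) / size
        row = []
        for j in range(size):  # face columns, left to right
            a = 2.0 * (j + 0.5) / size - 1.0
            row.append(direction_to_equirect(*to_dir(a, b), width, height))
        grid.append(row)
    return grid


if __name__ == "__main__":
    # Where does the centre of each face land in a 2048 x 1024 panorama?
    for name, to_dir in FACES.items():
        print(name, direction_to_equirect(*to_dir(0.0, 0.0), 2048, 1024))
```

Each grid entry is a fractional panorama coordinate, so a real implementation would typically sample the panorama there with bilinear interpolation. Because the six grids are independent of one another, the faces can be extracted and passed to a per-view network in parallel, which matches the parallel processing the abstract describes.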