Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion

📅 2026-05-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing perspective video generators are constrained by a limited field of view, which makes it hard to meet the demands of digital twins: precise camera control, global scene coverage, and spatiotemporal consistency. This work proposes a controllable generation framework based on 360° video diffusion, introducing explicit 3D geometric constraints into panoramic video synthesis for the first time. The approach reconstructs a 3D cache from sparse inputs and uses it as a geometric scaffold, decoupling geometric consistency from texture generation and guiding a diffusion model to synthesize high-fidelity videos along arbitrary user-defined camera trajectories. According to the authors' experiments, the method outperforms existing techniques in both visual quality and geometric coherence, enabling flexible and reliable 360° scene generation for downstream applications such as simulation and digital twins.
📝 Abstract
Generating complete digital twins from videos requires precise camera control, global scene coverage, and strict spatial-temporal consistency. These constraints remain challenging for perspective video generators because of their limited field of view (FoV): the narrow FoV forces long or multi-view trajectories, amplifying cross-view inconsistency and temporal drift. We argue that 360° video generation offers a natural solution: panoramic coverage simplifies trajectory design and provides a strong global context for maintaining coherence. We introduce Pantheon360, a controllable 360° video generation framework that synthesizes high-fidelity videos from sparse 360° inputs. The key idea is an explicit 3D Cache, reconstructed from the input, which serves as a geometric scaffold for any user-defined camera path. This allows the diffusion model to focus on photorealistic texture refinement while the 3D Cache enforces global geometric consistency. Experiments show that Pantheon360 achieves superior visual quality and unmatched geometric coherence, enabling reliable and flexible 360° scene generation for downstream simulation and digital-twin applications.
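To make the "geometric scaffold" idea concrete, here is a minimal sketch of how a 3D cache could be rendered into a 360° frame at one pose along a camera path. The abstract does not say how the 3D Cache is represented or rendered, so everything here is an assumption for illustration: the cache is taken to be a colored point cloud, the panorama is equirectangular, the axis convention is +z forward, +x right, +y up, and the function name `render_equirect` is hypothetical. It is not the authors' implementation.

```python
import numpy as np

def render_equirect(points, colors, cam_pos, cam_rot, height=64, width=128):
    """Splat a colored point cloud into an equirectangular panorama.

    points:  (N, 3) world-space positions (the assumed "3D cache").
    colors:  (N, 3) RGB values in [0, 1].
    cam_pos: (3,) camera center; cam_rot: (3, 3) world-to-camera rotation.
    Returns (image, depth, mask); mask is False where no point landed.
    """
    p = (points - cam_pos) @ cam_rot.T            # world -> camera frame
    r = np.linalg.norm(p, axis=1)                 # distance to the camera
    keep = r > 1e-6                               # drop points at the center
    p, r, c = p[keep], r[keep], colors[keep]

    # Assumed convention: +z forward, +x right, +y up.
    lon = np.arctan2(p[:, 0], p[:, 2])                  # in [-pi, pi]
    lat = np.arcsin(np.clip(p[:, 1] / r, -1.0, 1.0))    # in [-pi/2, pi/2]

    # Longitude maps to columns (wrapping around), latitude to rows.
    u = ((lon / (2 * np.pi) + 0.5) * width).astype(int) % width
    v = np.clip(((0.5 - lat / np.pi) * height).astype(int), 0, height - 1)

    # Z-buffer: keep the nearest point per pixel.
    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), r)
    nearest = r <= depth[v, u]

    image = np.zeros((height, width, 3))
    image[v[nearest], u[nearest]] = c[nearest]
    mask = np.isfinite(depth)
    return image, depth, mask
```

Calling this once per pose of a user-defined trajectory would yield a sequence of partial panoramas plus masks. Under the assumptions above, the rendered pixels carry the geometry, while the unmasked holes and coarse textures are what a generative model would be left to complete.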
Problem

Research questions and friction points this paper is trying to address.

digital twin
360° video generation
spatial-temporal consistency
field of view
geometric coherence
Innovation

Methods, ideas, or system contributions that make the work stand out.

360° video generation
digital twin
3D-aware diffusion
3D cache
spatial-temporal consistency