🤖 AI Summary
Existing methods for generative video synthesis and single-image-to-3D conversion often suffer from a lack of physical consistency, leading to issues such as object interpenetration, spatial misalignment, and stylization artifacts. This work proposes the first training-free, scene-level 3D generation framework that achieves physically consistent and interactive video synthesis through holistic geometric reconstruction within a unified spatial coordinate system. By decoupling physical simulation from rendering, the method supports complex multi-object mechanical interactions and real-time user control while preserving photorealistic visual fidelity. It substantially outperforms current approaches in terms of physical plausibility, spatial coherence, and user controllability.
📝 Abstract
Recent generative video models achieve impressive visual quality but remain constrained by limited physical consistency and controllability. Existing video generation methods provide minimal physical control, and single-image-to-3D conversion approaches often suffer from object interpenetration. Furthermore, physics-based scene-level 3D generation methods exhibit spatial misalignment, stylized artifacts, and inconsistencies with the input data, restricting their use in realistic interactive video synthesis. We propose TelePhysics, a training-free framework that converts a single image into a physically consistent and controllable video through holistic scene-level 3D reconstruction. By representing the full scene geometry in a unified spatial coordinate system, TelePhysics resolves object penetration and alignment ambiguity. Unlike prior methods, this formulation enables accurate scene-level multi-object interactions and introduces richer, more complex control types for advanced mechanics-based manipulation. By decoupling simulation from rendering, TelePhysics bypasses latency-heavy priors, achieving real-time physical interaction previews while preserving photorealistic visual fidelity. Experimental results demonstrate that TelePhysics substantially outperforms prior methods in physical fidelity, spatial coherence, and controllability. The open-source code is available at https://github.com/xinzhang007/TelePhysics.
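To illustrate why a unified spatial coordinate system helps with object penetration, the following is a minimal sketch and not the TelePhysics implementation. It assumes each object is approximated by an axis-aligned box expressed in one shared world frame; the `Box`, `penetration`, and `separate` names are hypothetical and chosen only for this example. Once all objects share a frame, overlap can be measured directly and removed by shifting one object along the axis of least overlap.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box in a shared world frame (hypothetical stand-in for object geometry)."""
    name: str
    lo: tuple  # (x, y, z) minimum corner
    hi: tuple  # (x, y, z) maximum corner


def penetration(a, b):
    """Return per-axis overlap depths, or None if the boxes do not interpenetrate."""
    depths = tuple(min(a.hi[i], b.hi[i]) - max(a.lo[i], b.lo[i]) for i in range(3))
    return depths if all(d > 0 for d in depths) else None


def separate(a, b):
    """Return a copy of b shifted along the axis of least overlap so it only touches a."""
    depths = penetration(a, b)
    if depths is None:
        return b  # already penetration-free
    axis = min(range(3), key=lambda i: depths[i])
    # Push b away from a's center along the chosen axis.
    center_a = (a.lo[axis] + a.hi[axis]) / 2
    center_b = (b.lo[axis] + b.hi[axis]) / 2
    shift = depths[axis] if center_b >= center_a else -depths[axis]
    lo = tuple(v + shift if i == axis else v for i, v in enumerate(b.lo))
    hi = tuple(v + shift if i == axis else v for i, v in enumerate(b.hi))
    return Box(b.name, lo, hi)


# Example: a cup reconstructed slightly sunk into a table top.
table = Box("table", (0.0, 0.0, 0.0), (2.0, 1.0, 2.0))
cup = Box("cup", (0.5, 0.75, 0.5), (1.0, 1.25, 1.0))
fixed_cup = separate(table, cup)
print(penetration(table, cup), fixed_cup.lo, penetration(table, fixed_cup))
```

In this toy scene the cup is lifted along the vertical axis until it rests on the table. Per-object reconstructions placed in separate frames would not expose this overlap without first being registered to a common one, which is the ambiguity the abstract refers to.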