🤖 AI Summary
This work addresses two challenges in complex physical scenes: low-accuracy interactive motion reconstruction and coordinated control of multiple objects. We propose LiveScene, a language-driven interactive radiance field framework. Methodologically, we introduce a novel scene-level language embedding mechanism that combines local deformable field decomposition with interaction-aware language-object alignment for precise spatial localization, enabling motion-decoupled modeling and fine-grained control driven by natural-language instructions. Built on NeRF optimization, the approach is evaluated on the OmniSim and InterReal datasets. Results show a 2.1 dB PSNR improvement in novel-view synthesis, an 18.7% gain in cross-modal grounding accuracy, a 39% reduction in GPU memory consumption, and a 42.5% decrease in dynamic reconstruction error, substantially outperforming existing interactive reconstruction and language-controlled methods.
📝 Abstract
This paper scales object-level reconstruction to complex scenes, advancing interactive scene reconstruction. We introduce two datasets, OmniSim and InterReal, featuring 28 scenes with multiple interactive objects. To address inaccurate interactive motion recovery in complex scenes, we propose LiveScene, a scene-level language-embedded interactive radiance field that efficiently reconstructs and controls multiple objects. By decomposing the interactive scene into local deformable fields, LiveScene reconstructs each object's motion separately, reducing memory consumption. Additionally, an interaction-aware language embedding localizes individual interactive objects, allowing arbitrary control via natural language. Extensive experiments show that our approach significantly outperforms prior methods in novel view synthesis, interactive scene control, and language grounding. Project page: https://livescenes.github.io.
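The two mechanisms described above, routing scene points through per-object local deformable fields and selecting an object to control from a language query, can be sketched in a few lines. The sketch below is purely illustrative, not LiveScene's implementation: the names (`LocalDeformableField`, `route_and_deform`, `select_field_by_language`) are hypothetical, the "deformation" is a toy translation rather than a learned field, and language grounding is reduced to cosine similarity over precomputed per-object text embeddings.

```python
import numpy as np

class LocalDeformableField:
    """Toy stand-in for one object's local deformable field: an axis-aligned
    bounding box plus a translation along `axis` scaled by an interaction state.
    (Hypothetical class, for illustration only.)"""

    def __init__(self, bbox_min, bbox_max, axis):
        self.bbox_min = np.asarray(bbox_min, dtype=float)
        self.bbox_max = np.asarray(bbox_max, dtype=float)
        self.axis = np.asarray(axis, dtype=float)

    def contains(self, pts):
        # Boolean mask of points inside this object's bounding box.
        return np.all((pts >= self.bbox_min) & (pts <= self.bbox_max), axis=1)

    def deform(self, pts, state):
        # Toy deformation: translate along `axis` by the interaction state.
        return pts + state * self.axis

def route_and_deform(pts, fields, states):
    """Route each 3D sample point through the local field whose box contains
    it; points outside every box stay static (the rigid background)."""
    out = pts.copy()
    for field, state in zip(fields, states):
        mask = field.contains(pts)
        out[mask] = field.deform(pts[mask], state)
    return out

def select_field_by_language(query_emb, field_embs):
    """Pick the object/field whose text embedding best matches the query,
    by cosine similarity (a simplification of language-object alignment)."""
    q = query_emb / np.linalg.norm(query_emb)
    sims = [float(np.dot(q, e / np.linalg.norm(e))) for e in field_embs]
    return int(np.argmax(sims))
```

Because only points inside an object's box are deformed, each object's motion is modeled independently of the others, which is the intuition behind the memory savings of decomposing one global deformable field into local ones.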