🤖 AI Summary
This work addresses the challenge of lowering the barrier to 3D scene authoring by reconciling generative AI with real-time, immersive editing—traditionally conflicting objectives. We propose a low-latency interactive editing framework for 3D radiance fields (NeRF/3D Gaussian Splatting) in VR, featuring: (1) a novel dynamic proxy geometry representation to mitigate generation latency; (2) a multi-granularity control architecture unifying natural language semantics and gesture-driven, geometry-level operations; and (3) lightweight generative model adaptation with multimodal input fusion. User studies demonstrate that hybrid (language + gesture) control improves editing efficiency by 42% and creative output quality by 31%, significantly enhancing iterative construction of complex scenes. Our core contribution is the first deep integration of generative AI into a closed-loop, real-time immersive 3D editing system, empirically validating the critical role of non-textual interfaces—particularly embodied gestures—in augmenting creative generation.
📝 Abstract
Authoring 3D scenes is a central task for spatial computing applications. Competing visions for lowering existing barriers either (1) focus on immersive, direct manipulation of 3D content or (2) leverage AI techniques that capture real scenes (3D Radiance Fields such as NeRFs and 3D Gaussian Splatting) and modify them at a higher level of abstraction, at the cost of high latency. We unify the complementary strengths of these approaches and investigate how to integrate generative AI advances into real-time, immersive 3D Radiance Field editing. We introduce Dreamcrafter, a VR-based 3D scene editing system that: (1) provides a modular architecture for integrating generative AI algorithms; (2) combines different levels of control for creating objects, including natural language and direct manipulation; and (3) introduces proxy representations that support interaction during high-latency operations. We contribute empirical findings on control preferences and discuss how generative AI interfaces beyond text input enhance creativity in scene editing and world building.
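The proxy-representation idea described above (keeping the scene interactive while a slow generative operation runs) can be sketched as a simple concurrency pattern: insert an editable placeholder immediately, run generation off the interaction thread, then swap in the result while preserving any edits the user made in the meantime. The sketch below is illustrative only; all class and function names (`ProxyEditor`, `SceneObject`, `fake_generate`) are hypothetical and not Dreamcrafter's actual API.

```python
import threading
import time
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    position: tuple = (0.0, 0.0, 0.0)
    is_proxy: bool = True  # placeholder until generation completes


class ProxyEditor:
    """Keeps the scene interactive while a high-latency generative call runs."""

    def __init__(self):
        self.scene = {}

    def request_object(self, prompt, generate_fn):
        # Insert an editable proxy immediately (e.g., a bounding box the
        # user can position and scale while the real asset is generated).
        proxy = SceneObject(name=prompt, is_proxy=True)
        self.scene[prompt] = proxy
        # Run the slow generation off the interaction thread.
        job = threading.Thread(target=self._finish, args=(prompt, generate_fn))
        job.start()
        return proxy, job

    def _finish(self, prompt, generate_fn):
        result = generate_fn(prompt)       # slow: e.g., diffusion or radiance-field fit
        proxy = self.scene[prompt]
        result.position = proxy.position   # keep edits made while waiting
        result.is_proxy = False
        self.scene[prompt] = result        # swap proxy for the generated asset


def fake_generate(prompt):
    time.sleep(0.1)                        # stand-in for a multi-second model call
    return SceneObject(name=prompt, is_proxy=False)


editor = ProxyEditor()
proxy, job = editor.request_object("oak tree", fake_generate)
proxy.position = (1.0, 0.0, 2.0)           # user repositions the proxy while waiting
job.join()
print(editor.scene["oak tree"].is_proxy)   # False: generated asset now in place
print(editor.scene["oak tree"].position)   # (1.0, 0.0, 2.0): the user's edit survives
```

A real system would additionally interpolate appearance (e.g., fading the placeholder into the generated radiance field) and handle cancellation when the user deletes the proxy mid-generation; the sketch only shows the core insert-then-swap structure.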