TRACE: High-Fidelity 3D Scene Editing via Tangible Reconstruction and Geometry-Aligned Contextual Video Masking

📅 2026-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of achieving fine-grained, part-level, high-fidelity editing of 3D scenes while preserving structural integrity. The authors propose a mesh-guided 3D Gaussian Splatting (3DGS) editing framework that anchors video diffusion models to explicit 3D geometry to enable automated, high-fidelity manipulation. Key contributions include MV-TRACE, the first multi-view consistent dataset dedicated to scene-coherent object addition and modification, alongside two novel mechanisms: Tangible Geometry Anchoring (TGA) and Contextual Video Masking (CVM). The method employs a three-stage pipeline integrating multi-view 3D-anchor synthesis, two-phase mesh-to-scene registration, and autoregressive video generation. Experiments demonstrate that the approach significantly outperforms existing methods in both editing flexibility and structural coherence, producing temporally consistent, physically plausible, high-fidelity 3D scene edits.
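The paper does not detail its two-phase registration here, but a common pattern for aligning an inserted mesh with a reconstructed scene is a coarse similarity fit (centroid and scale) followed by rigid ICP refinement. The sketch below illustrates that generic pattern only; the function names, the brute-force nearest-neighbour matching, and the Kabsch-based update are illustrative assumptions, not TRACE's actual implementation.

```python
import numpy as np

def coarse_align(src, dst):
    """Phase 1 (coarse): match centroids and isotropic scale (illustrative)."""
    sc, dc = src.mean(axis=0), dst.mean(axis=0)
    scale = np.linalg.norm(dst - dc) / np.linalg.norm(src - sc)
    return (src - sc) * scale + dc

def icp_refine(src, dst, iters=20):
    """Phase 2 (fine): rigid ICP with brute-force nearest neighbours."""
    for _ in range(iters):
        # closest dst point for every src point (O(N^2), kept simple for clarity)
        dists = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)
        matched = dst[dists.argmin(axis=1)]
        # best rigid transform for these correspondences (Kabsch algorithm)
        sc, mc = src.mean(axis=0), matched.mean(axis=0)
        H = (src - sc).T @ (matched - mc)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:  # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        src = (src - sc) @ R.T + mc
    return src
```

In practice, real systems would replace the brute-force matching with a k-d tree and often operate on sampled mesh surface points rather than raw vertices.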
📝 Abstract
We present TRACE, a mesh-guided 3DGS editing framework that achieves automated, high-fidelity scene transformation. By anchoring video diffusion with explicit 3D geometry, TRACE uniquely enables fine-grained, part-level manipulation, such as local pose shifting or component replacement, while preserving the structural integrity of the central subject, a capability largely absent in existing editing methods. Our approach comprises three key stages: (1) Multi-view 3D-Anchor Synthesis, which leverages a sparse-view editor trained on our MV-TRACE dataset (the first multi-view consistent dataset dedicated to scene-coherent object addition and modification) to generate spatially consistent 3D anchors; (2) Tangible Geometry Anchoring (TGA), which ensures precise spatial synchronization between inserted meshes and the 3DGS scene via two-phase registration; and (3) Contextual Video Masking (CVM), which integrates 3D projections into an autoregressive video pipeline to achieve temporally stable, physically grounded rendering. Extensive experiments demonstrate that TRACE consistently outperforms existing methods, especially in editing versatility and structural integrity.
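Contextual Video Masking conditions video generation on masks derived from projecting explicit 3D geometry into each frame's camera. As a rough illustration of how geometry can yield such per-frame masks, here is a generic pinhole-projection sketch; the intrinsics `K`, world-to-camera matrix `RT`, and the crude per-vertex splatting are assumptions for illustration, not the paper's renderer (a real pipeline would rasterise triangles with depth testing).

```python
import numpy as np

def project_mask(verts, K, RT, hw):
    """Splat projected mesh vertices (N, 3) into a binary per-frame mask."""
    h, w = hw
    cam = RT[:3, :3] @ verts.T + RT[:3, 3:4]   # world -> camera coordinates
    uv = (K @ cam).T
    uv = uv[:, :2] / uv[:, 2:3]                # perspective divide
    px = np.round(uv).astype(int)
    in_front = cam[2] > 0                      # drop points behind the camera
    in_frame = (px[:, 0] >= 0) & (px[:, 0] < w) & (px[:, 1] >= 0) & (px[:, 1] < h)
    keep = in_front & in_frame
    mask = np.zeros((h, w), dtype=bool)
    mask[px[keep, 1], px[keep, 0]] = True      # row = v, column = u
    return mask
```

Running this once per frame along a camera trajectory yields a mask sequence that a video model can use as a spatial conditioning signal.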
Problem

Research questions and friction points this paper is trying to address.

3D scene editing
structural integrity
part-level manipulation
high-fidelity editing
geometry consistency
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D Gaussian Splatting
Geometry-Aligned Editing
Multi-view Consistency
Tangible Geometry Anchoring
Contextual Video Masking