3D-LATTE: Latent Space 3D Editing from Textual Instructions

📅 2025-08-29
🤖 AI Summary
Existing text-driven 3D editing methods rely on 2D priors, leading to view inconsistency and geometric distortion. To address this, we propose a training-free editing framework that operates in the latent space of a native 3D diffusion model and directly manipulates 3D geometry. Our method blends 3D attention maps from the generation with those of the source object to guide spatially coherent edits, introduces geometry-aware regularization guidance and Fourier-domain spectral modulation to preserve structural fidelity, and applies a refinement step for 3D enhancement. This makes text-instructed editing possible directly on 3D geometric latent variables. Experiments demonstrate high-fidelity, view-consistent results across shape deformation, part replacement, and semantic redrawing tasks, outperforming previous 3D editing approaches.

📝 Abstract
Despite the recent success of multi-view diffusion models for text/image-based 3D asset generation, instruction-based editing of 3D assets lags surprisingly far behind the quality of generation models. The main reason is that recent approaches using 2D priors suffer from view-inconsistent editing signals. Going beyond 2D prior distillation methods and multi-view editing strategies, we propose a training-free editing method that operates within the latent space of a native 3D diffusion model, allowing us to directly manipulate 3D geometry. We guide the edit synthesis by blending 3D attention maps from the generation with those of the source object. Coupled with geometry-aware regularization guidance, a spectral modulation strategy in the Fourier domain, and a refinement step for 3D enhancement, our method outperforms previous 3D editing methods, enabling high-fidelity, precise, and robust edits across a wide range of shapes and semantic manipulations.
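The abstract does not say how the 3D attention maps are blended, so the following is only a minimal sketch of one plausible reading: a convex combination of the source object's attention map and the map produced during edit generation. The function names, the scalar weight `alpha`, and the toy token shapes are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_map(q, k):
    # Scaled dot-product attention weights, shape (n_query, n_key).
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d), axis=-1)

def blend_attention(attn_source, attn_edit, alpha):
    # Hypothetical blending rule (an assumption): a convex combination,
    # so each row of the result still sums to 1.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * attn_source + (1.0 - alpha) * attn_edit

# Toy latent tokens standing in for 3D latent features (random data).
rng = np.random.default_rng(0)
n_tokens, dim = 8, 16
q_src, k_src = rng.normal(size=(n_tokens, dim)), rng.normal(size=(n_tokens, dim))
q_edit, k_edit = rng.normal(size=(n_tokens, dim)), rng.normal(size=(n_tokens, dim))
v_edit = rng.normal(size=(n_tokens, dim))

attn_src = attention_map(q_src, k_src)
attn_edit = attention_map(q_edit, k_edit)

# A larger alpha keeps more of the source object's attention structure.
blended = blend_attention(attn_src, attn_edit, alpha=0.7)
out = blended @ v_edit  # attention output computed with the blended map
```

In this sketch, `alpha=1.0` reproduces the source attention exactly and `alpha=0.0` leaves the edit's attention untouched; an actual method might instead vary the weight per layer, per token, or per denoising step.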
Problem

Research questions and friction points this paper is trying to address.

Enabling view-consistent 3D editing from textual instructions
Overcoming limitations of 2D prior distillation methods
Directly manipulating 3D geometry in latent space
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free latent space 3D editing
3D attention map blending technique
Geometry-aware regularization guidance, Fourier-domain spectral modulation, and a 3D refinement step
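To make the idea of spectral modulation in the Fourier domain concrete, here is a rough sketch under stated assumptions. It takes a source latent grid and an edited latent grid, keeps the low-frequency band of the source, and takes the remaining frequencies from the edit. Which band comes from which signal, the radial `cutoff`, and the dense `(8, 8, 8)` grid are all hypothetical choices for illustration; the summary does not specify the paper's actual modulation rule.

```python
import numpy as np

def radial_lowpass_mask(shape, cutoff):
    # Boolean mask that is True where the normalized frequency radius
    # is at most `cutoff` (frequencies are in cycles per sample).
    freqs = np.meshgrid(*[np.fft.fftfreq(n) for n in shape], indexing="ij")
    radius = np.sqrt(sum(f ** 2 for f in freqs))
    return radius <= cutoff

def spectral_modulate(source, edited, cutoff=0.15):
    # Assumed rule: low frequencies from `source`, the rest from `edited`.
    mask = radial_lowpass_mask(source.shape, cutoff)
    src_f = np.fft.fftn(source)
    edt_f = np.fft.fftn(edited)
    mixed = np.where(mask, src_f, edt_f)
    # The mask is symmetric under frequency negation, so the mixed spectrum
    # of two real inputs stays Hermitian and the inverse is real up to
    # floating-point error.
    return np.fft.ifftn(mixed).real

# Toy latent grids (random data standing in for 3D latents).
rng = np.random.default_rng(0)
source_latent = rng.normal(size=(8, 8, 8))
edited_latent = rng.normal(size=(8, 8, 8))

modulated = spectral_modulate(source_latent, edited_latent, cutoff=0.15)
```

With a cutoff large enough to cover every frequency bin the function returns the source unchanged, and with a negative cutoff it returns the edit unchanged, which gives a quick sanity check on the masking logic.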