3D-LATTE: Latent Space 3D Editing from Textual Instructions

📅 2025-08-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing text-driven 3D editing methods rely on 2D priors, which produce view-inconsistent editing signals. To address this, the paper proposes a training-free editing framework that operates directly in the latent space of a native 3D diffusion model, so edits manipulate 3D geometry rather than individual views. The method blends 3D attention maps from the edit generation with those of the source object to guide the edit, and combines this with geometry-aware regularization guidance, spectral modulation in the Fourier domain, and a refinement step for 3D enhancement. The authors report high-fidelity, precise, and robust edits across a wide range of shapes and semantic manipulations, outperforming previous 3D editing methods.

📝 Abstract

Despite the recent success of multi-view diffusion models for text/image-based 3D asset generation, instruction-based editing of 3D assets lags surprisingly far behind the quality of generation models. The main reason is that recent approaches using 2D priors suffer from view-inconsistent editing signals. Going beyond 2D prior distillation methods and multi-view editing strategies, we propose a training-free editing method that operates within the latent space of a native 3D diffusion model, allowing us to directly manipulate 3D geometry. We guide the edit synthesis by blending 3D attention maps from the generation with the source object. Coupled with geometry-aware regularization guidance, a spectral modulation strategy in the Fourier domain, and a refinement step for 3D enhancement, our method outperforms previous 3D editing methods, enabling high-fidelity, precise, and robust edits across a wide range of shapes and semantic manipulations.
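To make the idea of spectral modulation in the Fourier domain more concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation: the abstract does not say which frequency bands are kept or modified, so the choice below (low frequencies from the source, the rest from the edit) is an assumption, as are the hypothetical names `spectral_blend` and `cutoff` and the use of a 1D signal in place of a 3D latent.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_blend(source, edited, cutoff):
    """Mix two real signals in the frequency domain.

    ASSUMPTION (not from the paper): frequencies at or below `cutoff`
    come from `source`, all higher frequencies come from `edited`.
    """
    n = len(source)
    if len(edited) != n:
        raise ValueError("source and edited must have the same length")
    src_spec = dft(source)
    edit_spec = dft(edited)
    mixed = []
    for k in range(n):
        # Bin k and bin n - k are mirror images, so use the distance from DC.
        # A symmetric mask keeps the output real for real inputs.
        freq = min(k, n - k)
        mixed.append(src_spec[k] if freq <= cutoff else edit_spec[k])
    return [value.real for value in idft(mixed)]


source = [1.0, 2.0, 3.0, 4.0]
edited = [0.0, 4.0, 0.0, 4.0]
blended = spectral_blend(source, edited, cutoff=0)
```

With `cutoff=0`, only the mean of the source survives and all variation comes from the edit. A real implementation would use an FFT library on multi-dimensional latents rather than this quadratic-time transform.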

Problem

Research questions and friction points this paper is trying to address.

Enabling view-consistent 3D editing from textual instructions

Overcoming limitations of 2D prior distillation methods

Directly manipulating 3D geometry in latent space

Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free latent space 3D editing

3D attention map blending technique

Geometry-aware regularization, Fourier-domain spectral modulation, and a 3D refinement step
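As a rough illustration of what blending attention maps can look like, the sketch below linearly interpolates two row-normalized attention matrices. This is an assumption for illustration only: the abstract does not give the blending rule, and the names `blend_attention` and `alpha` are hypothetical.

```python
def blend_attention(source_attn, edit_attn, alpha):
    """Blend two attention maps row by row.

    ASSUMPTION (not from the paper): a convex combination in which
    `alpha` weights the source map and `1 - alpha` weights the edit map.
    Each map is a list of rows, and each row holds attention weights.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(source_attn) != len(edit_attn):
        raise ValueError("attention maps must have the same number of rows")
    blended = []
    for src_row, edit_row in zip(source_attn, edit_attn):
        if len(src_row) != len(edit_row):
            raise ValueError("attention rows must have the same length")
        row = [alpha * s + (1.0 - alpha) * e for s, e in zip(src_row, edit_row)]
        total = sum(row)
        # Renormalize so each row still sums to 1 despite rounding error.
        blended.append([w / total for w in row] if total > 0 else row)
    return blended


source_attn = [[1.0, 0.0], [0.5, 0.5]]
edit_attn = [[0.0, 1.0], [0.5, 0.5]]
mixed = blend_attention(source_attn, edit_attn, alpha=0.25)
```

A larger `alpha` keeps the result closer to the source object's attention pattern, while a smaller one lets the edit dominate.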
