Prox-E: Fine-Grained 3D Shape Editing via Primitive-Based Abstractions

📅 2026-04-26
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing 3D editing methods driven by 2D image editors struggle to make fine-grained local modifications while preserving the overall identity of the object. This work proposes Prox-E, a training-free 3D editing framework built on an explicit, primitive-based geometric abstraction. It abstracts the input 3D shape into a compact set of geometric primitives, uses a pretrained vision-language model (VLM) to edit that abstraction according to the instruction, and then uses the resulting primitive-level changes to guide a 3D generative model toward localized edits that preserve unchanged regions. In the authors' experiments, the method balances identity preservation, shape quality, and instruction fidelity more effectively than existing approaches, including 2D-based 3D editors and training-based methods.

📝 Abstract
Text-based 2D image editing models have recently reached an impressive level of maturity, motivating a growing body of work that heavily depends on these models to drive 3D edits. While effective for appearance-based modifications, such 2D-centric 3D editing pipelines often struggle with fine-grained 3D editing, where localized structural changes must be applied while strictly preserving an object's overall identity. To address this limitation, we propose Prox-E, a training-free framework that enables fine-grained 3D control through an explicit, primitive-based geometric abstraction. Our framework first abstracts an input 3D shape into a compact set of geometric primitives. A pretrained vision-language model (VLM) then edits this abstraction to specify primitive-level changes. These structural edits are subsequently used to guide a 3D generative model, enabling fine-grained, localized modifications while preserving unchanged regions of the original shape. Through extensive experiments, we demonstrate that our method consistently balances identity preservation, shape quality, and instruction fidelity more effectively than various existing approaches, including 2D-based 3D editors and training-based methods.
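
The abstract-then-edit step described above can be illustrated with a small sketch. The abstract does not specify the primitive type, the edit format the VLM produces, or how the 3D generative model is guided, so everything here is an assumption made for illustration: the box-shaped `Primitive` class, the `scale` edit dictionary (a stand-in for a VLM's output), and the helper names `apply_edit` and `changed_names` are all hypothetical and are not taken from Prox-E.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Set, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Primitive:
    """Hypothetical box primitive: a label, a center, and a per-axis size."""
    name: str
    center: Vec3
    size: Vec3


def apply_edit(primitives: List[Primitive], edit: Dict) -> List[Primitive]:
    """Apply one primitive-level edit; only a per-axis 'scale' op is sketched."""
    if edit["op"] != "scale":
        raise ValueError(f"unsupported op: {edit['op']}")
    fx, fy, fz = edit["factor"]
    out = []
    for p in primitives:
        if p.name == edit["target"]:
            sx, sy, sz = p.size
            # Frozen dataclass: build a modified copy instead of mutating.
            p = replace(p, size=(sx * fx, sy * fy, sz * fz))
        out.append(p)
    return out


def changed_names(before: List[Primitive], after: List[Primitive]) -> Set[str]:
    """Names of primitives that are new or differ from the original abstraction."""
    old = {p.name: p for p in before}
    return {p.name for p in after if old.get(p.name) != p}


# Toy abstraction of a chair (assumed values, not data from the paper).
chair = [
    Primitive("seat", (0.0, 0.5, 0.0), (1.0, 0.1, 1.0)),
    Primitive("back", (0.0, 1.0, -0.45), (1.0, 1.0, 0.1)),
    Primitive("leg", (0.4, 0.25, 0.4), (0.1, 0.5, 0.1)),
]

# Assumed stand-in for a VLM's reading of "make the backrest taller".
edit = {"op": "scale", "target": "back", "factor": (1.0, 1.5, 1.0)}

edited = apply_edit(chair, edit)
to_regenerate = changed_names(chair, edited)
```

In this sketch, `to_regenerate` names the only primitive the edit touched, which is the kind of signal a downstream generator could use to confine changes to one region and leave the rest of the shape as it was. How Prox-E actually turns primitive-level changes into guidance is not described in the abstract.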

Problem

Research questions and friction points this paper is trying to address.

fine-grained 3D editing
3D shape editing
identity preservation
localized structural changes
primitive-based abstraction

Innovation

Methods, ideas, or system contributions that make the work stand out.

primitive-based abstraction
fine-grained 3D editing
training-free framework
vision-language model
3D generative model