🤖 AI Summary
Existing text-guided 3D editing methods adopt a view-agnostic paradigm, neglecting inter-view semantic dependencies—leading to inconsistent local edits and semantic misalignment. This paper proposes a progressive view-aware editing paradigm: starting from a salient primary view, semantic edits are dynamically propagated to sparse auxiliary views to ensure spatial coherence and geometric fidelity. Our key contributions are: (1) Mixture-of-View-Experts LoRA (MoVE-LoRA), enabling precise cross-view semantic alignment; (2) a collaborative optimization framework integrating a Primary-view Sampler and a Full-view Refiner; and (3) an end-to-end differentiable framework unifying diffusion-based generation, differentiable rendering, and low-rank adaptation. Evaluated on multiple benchmarks, our method significantly improves editing accuracy and cross-view consistency. Both qualitative and quantitative results surpass state-of-the-art approaches, enabling high-fidelity, semantically complex, localized 3D editing.
📝 Abstract
Text-guided 3D editing aims to precisely edit semantically relevant local 3D regions, which has significant potential for various practical applications ranging from 3D games to film production. Existing methods typically follow a view-indiscriminate paradigm: editing 2D views indiscriminately and projecting them back into 3D space. However, they overlook cross-view interdependencies, resulting in inconsistent multi-view editing. In this study, we argue that ideal consistent 3D editing can be achieved through a *progressive-views paradigm*, which propagates editing semantics from the editing-salient view to other editing-sparse views. Specifically, we propose *Pro3D-Editor*, a novel framework, which mainly includes Primary-view Sampler, Key-view Render, and Full-view Refiner. Primary-view Sampler dynamically samples and edits the most editing-salient view as the primary view. Key-view Render accurately propagates editing semantics from the primary view to other key views through its Mixture-of-View-Experts Low-Rank Adaptation (MoVE-LoRA). Full-view Refiner edits and refines the 3D object based on the edited multi-views. Extensive experiments demonstrate that our method outperforms existing methods in editing accuracy and spatial consistency.
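To make the MoVE-LoRA idea concrete, here is a minimal NumPy sketch of a mixture-of-view-experts low-rank adapter: each view index routes to its own low-rank update (A_v, B_v) on top of a shared frozen weight W, so editing semantics can be adapted per view. All names, shapes, and the routing-by-index scheme are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n_views = 8, 2, 4            # feature dim, LoRA rank, number of key views

W = rng.standard_normal((d, d))    # frozen base projection (shared across views)
# Per-view LoRA experts: down- and up-projections (hypothetical parameterization)
A = rng.standard_normal((n_views, d, r)) * 0.01
B = rng.standard_normal((n_views, r, d)) * 0.01

def move_lora_forward(x, view_idx):
    """Apply the frozen base weight plus the view-specific low-rank update.

    x: (batch, d) features; view_idx selects which view-expert adapts them.
    """
    delta = A[view_idx] @ B[view_idx]   # (d, d) rank-r update for this view
    return x @ (W + delta)

# Same input feature, routed through different view experts,
# yields view-specific (but closely related) adapted outputs.
x = rng.standard_normal((1, d))
outs = [move_lora_forward(x, v) for v in range(n_views)]
```

The design point this sketch captures is that only the small per-view factors (A_v, B_v) are trained while W stays frozen, so cross-view propagation is cheap to fit from sparse auxiliary views.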