C3Editor: Achieving Controllable Consistency in 2D Model for 3D Editing

📅 2025-10-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 2D-lifted 3D editing methods suffer from inconsistent multi-view outputs due to the lack of a view-consistent 2D editing model. This paper introduces C3Editor, the first framework to establish a view-consistent 2D editing model for controllable, cross-view consistent, text-driven and interactive 3D content editing. Its core contributions are: (1) a decoupled dual-LoRA architecture, separately optimizing photorealistic view reconstruction and multi-view consistency; (2) synergistic integration of fine-tuned 2D diffusion models, explicit multi-view consistency constraints, and a 2D-to-3D feature lifting strategy; and (3) support for user-guided manual editing. Extensive experiments demonstrate that C3Editor significantly outperforms state-of-the-art methods in both qualitative and quantitative evaluations, achieving substantial improvements in visual consistency across views and geometric fidelity.

📝 Abstract
Existing 2D-lifting-based 3D editing methods often encounter challenges related to inconsistency, stemming from the lack of view-consistent 2D editing models and the difficulty of ensuring consistent editing across multiple views. To address these issues, we propose C3Editor, a controllable and consistent 2D-lifting-based 3D editing framework. Given an original 3D representation and a text-based editing prompt, our method selectively establishes a view-consistent 2D editing model to achieve superior 3D editing results. The process begins with the controlled selection of a ground truth (GT) view and its corresponding edited image as the optimization target, allowing for user-defined manual edits. Next, we fine-tune the 2D editing model within the GT view and across multiple views to align with the GT-edited image while ensuring multi-view consistency. To meet the distinct requirements of GT view fitting and multi-view consistency, we introduce separate LoRA modules for targeted fine-tuning. Our approach delivers more consistent and controllable 2D and 3D editing results than existing 2D-lifting-based methods, outperforming them in both qualitative and quantitative evaluations.
Problem

Research questions and friction points this paper is trying to address.

Achieving view-consistent 2D editing models for 3D generation
Ensuring consistent editing across multiple 3D views
Improving controllability in 2D-lifting-based 3D editing methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Controlled selection of a ground truth (GT) view and its edited image as the optimization target, allowing user-defined manual edits
Fine-tunes the 2D editing model within the GT view and across multiple views to align with the GT-edited image
Uses separate, decoupled LoRA modules for GT-view fitting and multi-view consistency
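The decoupled dual-LoRA idea above can be sketched as a base layer with two independently switchable low-rank adapters, one for GT-view fitting and one for multi-view consistency. This is a minimal NumPy illustration under assumed shapes and names (`DualLoRALinear`, `gt_view`, `multi_view` are hypothetical, not from the paper's code):

```python
import numpy as np

class DualLoRALinear:
    """Hypothetical sketch: a frozen linear layer with two separate LoRA
    adapters, loosely mirroring C3Editor's decoupling of GT-view fitting
    and multi-view consistency. Shapes and names are assumptions."""

    def __init__(self, in_dim, out_dim, rank=4, alpha=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((out_dim, in_dim)) * 0.02  # frozen base weight
        self.scale = alpha / rank
        # Each adapter is a low-rank pair (A, B); B is zero-initialized,
        # so an untrained adapter leaves the base layer's output unchanged.
        self.adapters = {
            name: (rng.standard_normal((rank, in_dim)) * 0.01,  # A: rank x in
                   np.zeros((out_dim, rank)))                   # B: out x rank
            for name in ("gt_view", "multi_view")
        }

    def forward(self, x, adapter=None):
        y = x @ self.W.T
        if adapter is not None:  # apply only the selected low-rank update
            A, B = self.adapters[adapter]
            y = y + self.scale * (x @ A.T @ B.T)
        return y
```

Because `B` starts at zero, activating either adapter before training reproduces the base output exactly; each objective then updates only its own `(A, B)` pair, so the two fine-tuning targets never share parameters.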
🔎 Similar Papers
2024-03-18 · European Conference on Computer Vision · Citations: 16