DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation

📅 2024-07-16

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing text-driven 3D editing methods rely on Score Distillation Sampling (SDS) but suffer from long training times and low-quality results, because SDS as commonly applied conflicts with the sampling dynamics of diffusion models. This work identifies that conflict and reformulates the SDS optimization so that it approximates the diffusion reverse process for editing tasks. The resulting framework, DreamCatalyst, controls editability and identity preservation and offers two modes: a fast mode and a high-quality mode. It applies to both Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS). On NeRF scenes, editing is approximately 23× faster (fast mode) or 8× faster (high-quality mode) than prior state-of-the-art NeRF editing methods, and the high-quality mode also produces better results. The method also surpasses state-of-the-art 3DGS editing methods.

📝 Abstract

Score distillation sampling (SDS) has emerged as an effective framework in text-driven 3D editing tasks, leveraging diffusion models for 3D-consistent editing. However, existing SDS-based 3D editing methods suffer from long training times and produce low-quality results. We identify that the root cause of this performance degradation is their conflict with the sampling dynamics of diffusion models. Addressing this conflict allows us to treat SDS as a diffusion reverse process for 3D editing via sampling from data space. In contrast, existing methods naively distill the score function using diffusion models. From these insights, we propose DreamCatalyst, a novel framework that considers these sampling dynamics in the SDS framework. Specifically, we devise the optimization process of our DreamCatalyst to approximate the diffusion reverse process in editing tasks, thereby aligning with diffusion sampling dynamics. As a result, DreamCatalyst successfully reduces training time and improves editing quality. Our method offers two modes: (1) a fast mode that edits Neural Radiance Fields (NeRF) scenes approximately 23 times faster than current state-of-the-art NeRF editing methods, and (2) a high-quality mode that produces superior results about 8 times faster than these methods. Notably, our high-quality mode outperforms current state-of-the-art NeRF editing methods in terms of both speed and quality. DreamCatalyst also surpasses the state-of-the-art 3D Gaussian Splatting (3DGS) editing methods, establishing itself as an effective and model-agnostic 3D editing solution. See more extensive results on our project page: https://dream-catalyst.github.io.
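
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the generic SDS update, not DreamCatalyst's objective. It rests on several assumptions that are not in the paper: the "renderer" is the identity map (the parameters are the image), the weighting w(t) is fixed to 1, and the pretrained diffusion model is replaced by a closed-form noise predictor for Gaussian data. All function names are hypothetical.

```python
import numpy as np

def toy_noise_predictor(x_t, alpha_bar, mu, sigma):
    """Stand-in for a pretrained diffusion model (assumption).

    Closed-form optimal noise prediction when clean data ~ N(mu, sigma^2 I)
    and x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps.
    """
    var_t = alpha_bar * sigma**2 + (1.0 - alpha_bar)
    return np.sqrt(1.0 - alpha_bar) * (x_t - np.sqrt(alpha_bar) * mu) / var_t

def sds_gradient(theta, mu, sigma, rng, batch=64):
    """Monte Carlo estimate of the SDS gradient with w(t) = 1.

    The renderer is the identity (x = theta), so dx/dtheta is the identity
    and the gradient reduces to the mean of (eps_hat - eps).
    """
    alpha_bar = rng.uniform(0.05, 0.95, size=(batch, 1))  # random noise levels
    eps = rng.standard_normal((batch, theta.size))        # injected noise
    x_t = np.sqrt(alpha_bar) * theta + np.sqrt(1.0 - alpha_bar) * eps
    eps_hat = toy_noise_predictor(x_t, alpha_bar, mu, sigma)
    return (eps_hat - eps).mean(axis=0)

def run_sds(theta0, mu, sigma=0.1, lr=0.1, steps=500, seed=0):
    """Plain gradient descent on theta using the SDS gradient estimate."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        theta -= lr * sds_gradient(theta, mu, sigma, rng)
    return theta

if __name__ == "__main__":
    target = np.array([1.0, -2.0, 0.5, 0.0])
    print(run_sds(np.zeros(4), target))
```

In this toy setting the parameters drift toward the mode of the stand-in data distribution. Each step draws a fresh random noise level, which is the "naive" score distillation the abstract contrasts with an optimization that follows the diffusion reverse process.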

Problem

Research questions and friction points this paper is trying to address.

Existing SDS-based 3D editing methods require long training times.

Existing SDS-based 3D editing methods produce low-quality editing results.

Existing SDS-based methods conflict with the sampling dynamics of diffusion models.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Score distillation sampling optimization

Neural Radiance Fields editing

Diffusion reverse process approximation

🔎 Similar Papers

GaussianBlock: Building Part-Aware Compositional and Editable 3D Scene by Primitives and Gaussians