🤖 AI Summary
This work addresses localized 3D mesh editing guided by a single edited image. We propose an editing framework based on mask-conditioned reconstruction: a user-specified 3D region serves as a geometric mask, and the edited image acts as a conditioning signal that guides a Large Reconstruction Model (LRM) to regenerate only the masked region while preserving the unmasked geometry. To our knowledge, this is the first method to adapt an LRM for mask-conditioned mesh editing. The approach combines multi-view-consistent mask rendering, randomly generated 3D occlusions, and single-view conditioning, so that the geometric update is produced in a single forward pass. The framework supports diverse edits (including deformation, part replacement, and detail sculpting), with reconstruction quality on par with the state of the art, while running 10× faster than the best-performing prior method.
📝 Abstract
We present a novel approach to mesh shape editing, building on recent progress in 3D reconstruction from multi-view images. We formulate shape editing as a conditional reconstruction problem, where the model must reconstruct the input shape except within a specified 3D region, in which the geometry should be generated from the conditional signal. To this end, we train a conditional Large Reconstruction Model (LRM) for masked reconstruction, using multi-view-consistent masks rendered from a randomly generated 3D occlusion, and using one clean viewpoint as the conditional signal. During inference, we manually define a 3D region to edit and provide an edited image from a canonical viewpoint to fill in that region. We demonstrate that, in just a single forward pass, our method not only preserves the input geometry in the unmasked region, with reconstruction quality on par with the state of the art, but is also expressive enough to perform a variety of mesh edits from a single guidance image that past works struggle with, while being 10× faster than the top-performing prior work.
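To make the mask-generation idea concrete, the following is a minimal sketch of how multi-view-consistent masks could be derived from one randomly sampled 3D occluder. It is an illustration under stated assumptions, not the authors' implementation: the axis-aligned box occluder, the orbiting pinhole cameras, the point-splatting renderer, and every function name and parameter (`sample_box_occluder`, `look_at_camera`, `render_mask`, `make_training_masks`, `focal`, `res`) are hypothetical. Because every mask is a projection of the same 3D region, the masks agree across views by construction.

```python
import numpy as np


def sample_box_occluder(rng, extent=1.0, min_size=0.2, max_size=0.8):
    """Sample a random axis-aligned box inside [-extent, extent]^3 (assumed occluder shape)."""
    size = rng.uniform(min_size, max_size, size=3)
    lo = rng.uniform(-extent, extent - size)  # per-axis lower corner keeps the box in bounds
    return lo, lo + size


def look_at_camera(azimuth_deg, radius=3.0):
    """World-to-camera rotation R and translation t for a camera orbiting the origin."""
    a = np.deg2rad(azimuth_deg)
    eye = radius * np.array([np.sin(a), 0.0, np.cos(a)])
    forward = -eye / np.linalg.norm(eye)            # camera looks at the origin
    right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    R = np.stack([right, up, forward])              # rows are the camera axes
    t = -R @ eye
    return R, t


def render_mask(lo, hi, R, t, res=64, focal=1.5, n=32):
    """Binary mask of the box in one view, obtained by splatting a dense grid of box points.

    Simplification: the occluder is not depth-tested against the mesh.
    """
    axes = [np.linspace(lo[i], hi[i], n) for i in range(3)]
    pts = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    cam = pts @ R.T + t                             # world -> camera coordinates
    z = cam[:, 2]                                   # depth along the viewing direction
    u = focal * cam[:, 0] / z                       # pinhole projection to [-1, 1]
    v = focal * cam[:, 1] / z
    col = np.floor((u + 1.0) / 2.0 * res).astype(int)
    row = np.floor((1.0 - v) / 2.0 * res).astype(int)
    ok = (z > 0) & (col >= 0) & (col < res) & (row >= 0) & (row < res)
    mask = np.zeros((res, res), dtype=bool)
    mask[row[ok], col[ok]] = True
    return mask


def make_training_masks(seed=0, azimuths=(0, 90, 180, 270)):
    """One random 3D occluder, rendered into one mask per camera."""
    rng = np.random.default_rng(seed)
    lo, hi = sample_box_occluder(rng)
    masks = np.stack([render_mask(lo, hi, *look_at_camera(a)) for a in azimuths])
    return (lo, hi), masks


if __name__ == "__main__":
    (lo, hi), masks = make_training_masks(seed=0)
    print("occluder box:", lo, hi)
    print("masked pixels per view:", masks.sum(axis=(1, 2)))
```

In a training setup along these lines, the masks would hide the corresponding pixels of the multi-view input renders while one view is left clean as the conditional signal. How the conditioning view relates to the masked views is not specified in the abstract, so that step is left out of the sketch.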