Velocity-Space 3D Asset Editing

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing local 3D editing methods often suffer from identity leakage, weakened edits, and global distortion because they process edit signals outside the ODE sampler. This work proposes VS3D, a framework that performs local editing without inversion, training, or masks by intervening directly on the velocity field during ODE sampling, with one module per stage of the editing pipeline. Its three modules are Reconstruction-Anchored Source Injection (RASI), which suppresses identity leakage; Partial-Mean Guidance (PMG), which amplifies the edit signal; and Twin-Agreement Residual injection (TAR), which makes per-token preservation decisions at the geometry and material stages. Because it operates entirely inside the ODE solver rather than through external constraints, VS3D aims to control target-region geometry and appearance while preserving non-edited regions.

📝 Abstract

Editing a 3D asset locally, modifying a target region while preserving the rest, is a fundamental requirement of native 3D editing. Existing methods enforce locality through mechanisms external to the generator, such as manual 3D masks, post-hoc voxel merging, or 2D multi-view lifting. None of them intervenes where the corruption actually originates: inside the ODE sampler. For a rectified-flow generator to achieve faithful local editing, its velocity field should be strong over the target editing region while vanishing on preserved content. Yet a single velocity field can hardly satisfy both requirements simultaneously, leading to three problems: (i) identity leakage that keeps the edit signal non-zero on preserved regions; (ii) no dedicated edit-amplification channel, so strengthening the edit inevitably perturbs identity; and (iii) an identity drag at the geometry and material stages, where a global condition pulls every token toward the target. We propose VS3D (Velocity-Space 3D Asset Editing), an inversion-free, training-free, and mask-free framework that addresses each problem with a targeted intervention inside the sampler. VS3D integrates three complementary modules, each corresponding to a specific stage of the editing pipeline. Reconstruction-Anchored Source Injection (RASI) absorbs identity leakage by turning the unconditional embedding into a per-step, asset-specific anchor calibrated through source reconstruction. Partial-Mean Guidance (PMG) amplifies the edit signal by contrasting high- and low-quality subsample estimates of the velocity difference, active only where a consistent edit exists. Twin-Agreement Residual injection (TAR) lets the sampler decide token by token what to preserve at the geometry and material stages.
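To make the locality requirement concrete, the following is a toy sketch of Euler sampling under a rectified-flow-style velocity field. It is not VS3D or any of its modules: `toy_velocity`, `euler_sample`, the four-token "asset", and the constant `leak` term are all hypothetical names and values chosen for illustration. It only shows that an edit signal which vanishes on preserved tokens leaves them untouched, while a spurious velocity component on those tokens makes them drift.

```python
import numpy as np

def toy_velocity(x, t, cond):
    """Hypothetical velocity field that transports x to `cond` by t = 1."""
    return (cond - x) / (1.0 - t)

def euler_sample(x0, cond, steps=10, leak=None):
    """Integrate dx/dt = v(x, t, cond) from t = 0 to t = 1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = toy_velocity(x, t, cond)
        if leak is not None:
            v = v + leak  # assumed spurious velocity added on every token
        x = x + dt * v
    return x

# Four scalar "tokens": tokens 0-1 are preserved, tokens 2-3 are edited.
source = np.array([1.0, 2.0, 3.0, 4.0])
target = source.copy()
target[2:] = [7.0, 8.0]

rng = np.random.default_rng(0)
x0 = rng.standard_normal(4)  # shared initial noise

# Edit signal: the difference between target- and source-conditioned velocities.
edit_signal = toy_velocity(x0, 0.0, target) - toy_velocity(x0, 0.0, source)

clean = euler_sample(x0, target)                      # edit signal is local
leaky = euler_sample(x0, target, leak=np.full(4, 0.5))  # leakage on all tokens

print("edit signal:", edit_signal)
print("preserved tokens (clean):", clean[:2])
print("preserved tokens (leaky):", leaky[:2])
```

In this toy setting the edit signal is exactly zero on the preserved tokens, so the clean run reproduces the source there. A real generator gives no such guarantee, which is the gap the abstract attributes to a single velocity field.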

Problem

Research questions and friction points this paper is trying to address.

local 3D editing

identity leakage

edit-amplification

velocity field

rectified-flow generator

Innovation

Methods, ideas, or system contributions that make the work stand out.

velocity-space editing

local 3D editing

rectified flow