AI Summary
Existing drag-based image editing methods neglect global geometric structure, leading to deformation artifacts and inaccurate control-point alignment; moreover, existing benchmarks provide no ground truth, which hinders objective evaluation. To address these issues, we propose FlowDrag, a method that constructs a 3D mesh from the image to encode geometric priors, deforms the mesh by minimizing an energy function driven by the user-defined drag points, projects the resulting displacements into 2D to obtain a structurally consistent deformation field, and incorporates that field into a UNet denoising process to preserve fine-grained detail. Furthermore, we introduce VFD (VidFrameDrag), a benchmark that provides ground-truth frames taken from consecutive shots in a video dataset. Evaluated on VFD Bench and DragBench, FlowDrag achieves significant improvements in control-point accuracy (+12.6%) and structural stability (+9.3% PSNR), enabling precise, geometry-consistent, and highly controllable local editing.
Abstract
Drag-based editing allows precise object manipulation through point-based control, offering user convenience. However, current methods often suffer from geometric inconsistency: they focus exclusively on matching user-defined points and neglect the broader geometry, which leads to artifacts or unstable edits. We propose FlowDrag, which leverages geometric information for more accurate and coherent transformations. Our approach constructs a 3D mesh from the image and uses an energy function to guide mesh deformation based on the user-defined drag points. The resulting mesh displacements are projected into 2D and incorporated into a UNet denoising process, enabling precise handle-to-target point alignment while preserving structural integrity. Additionally, existing drag-editing benchmarks provide no ground truth, making it difficult to assess how accurately the edits match the intended transformations. To address this, we present the VFD (VidFrameDrag) benchmark dataset, which provides ground-truth frames taken from consecutive shots in a video dataset. FlowDrag outperforms existing drag-based editing methods on both VFD Bench and DragBench.
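To make the mesh-to-flow idea concrete, the sketch below illustrates the general technique, not the FlowDrag implementation. The abstract does not specify the energy function, so the code assumes a simple quadratic energy: an edge term that penalizes differences between neighboring vertex displacements, plus a soft constraint term that pulls handle vertices toward their targets and pins border vertices in place. It also assumes a flat regular grid in place of a mesh reconstructed from the image, and an orthographic projection to 2D. All function names (`grid_mesh`, `deform`, `project_to_flow`) are hypothetical, and the UNet denoising step is omitted.

```python
import numpy as np

def grid_mesh(h, w, depth=None):
    """Return (h*w, 3) vertices and an (m, 2) edge list for a regular grid mesh.
    Assumption: a flat grid (z = 0) unless a per-pixel depth map is supplied."""
    ys, xs = np.mgrid[0:h, 0:w]
    z = np.zeros((h, w)) if depth is None else np.asarray(depth, dtype=float)
    verts = np.stack([xs.ravel(), ys.ravel(), z.ravel()], axis=1).astype(float)
    idx = np.arange(h * w).reshape(h, w)
    edges = np.concatenate([
        np.stack([idx[:, :-1].ravel(), idx[:, 1:].ravel()], axis=1),  # horizontal
        np.stack([idx[:-1, :].ravel(), idx[1:, :].ravel()], axis=1),  # vertical
    ])
    return verts, edges

def deform(verts, edges, constraints, weight=100.0):
    """Minimize  sum_edges |d_i - d_j|^2 + weight * sum_c |d_c - t_c|^2
    over per-vertex displacements d, as one linear least-squares problem.
    `constraints` maps a vertex index to its desired 3D displacement t_c."""
    n, m, k = len(verts), len(edges), len(constraints)
    A = np.zeros((m + k, n))
    b = np.zeros((m + k, 3))
    rows = np.arange(m)
    A[rows, edges[:, 0]] = 1.0    # smoothness rows: d_i - d_j = 0
    A[rows, edges[:, 1]] = -1.0
    s = np.sqrt(weight)           # soft constraint rows: s * d_c = s * t_c
    for r, (vid, disp) in enumerate(constraints.items()):
        A[m + r, vid] = s
        b[m + r] = s * np.asarray(disp, dtype=float)
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d                      # shape (n, 3)

def project_to_flow(disp, h, w):
    """Orthographic projection: drop z and reshape to an (h, w, 2) flow of (dx, dy)."""
    return disp[:, :2].reshape(h, w, 2)

# Demo: pin the border of a 9x9 mesh and drag the center vertex 2 units in +x.
h, w = 9, 9
verts, edges = grid_mesh(h, w)
idx = np.arange(h * w).reshape(h, w)
border = np.concatenate([idx[0], idx[-1], idx[:, 0], idx[:, -1]])
constraints = {int(i): (0.0, 0.0, 0.0) for i in border}
constraints[int(idx[4, 4])] = (2.0, 0.0, 0.0)
flow = project_to_flow(deform(verts, edges, constraints), h, w)
```

In this toy setup the displacement is largest at the dragged vertex and decays smoothly toward the pinned border, so neighboring pixels move coherently rather than only the handle point moving. A dense field of this kind is what a drag editor could use as geometric guidance during denoising.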