🤖 AI Summary
Existing image generation methods struggle to control 3D geometry precisely, which limits their use in engineering design and creative production. This paper introduces GeoDiffusion, a training-free framework for geometric conditioning of 3D features in image generation. It uses a class-specific 3D object as a geometric prior to define keypoints and parametric correlations in 3D space, renders a reference 3D object to keep the viewpoint consistent, and applies style transfer to match user-defined appearance specifications. At its core is GeoDrag, a drag-based editing method that improves the accuracy and speed of edits on geometry guidance tasks and on the general instructions of DragBench. The reported results indicate that GeoDiffusion supports precise geometric modifications across iterative design workflows, narrowing the gap between generative modeling and practical design work.
📝 Abstract
Precise geometric control in image generation is essential for engineering and product design and for the creative industries, where 3D object features must be controlled accurately in image space. Traditional 3D editing approaches are time-consuming and demand specialized skills, while current image-based generative methods lack accuracy in geometric conditioning. To address these challenges, we propose GeoDiffusion, a training-free framework for accurate and efficient geometric conditioning of 3D features in image generation. GeoDiffusion employs a class-specific 3D object as a geometric prior to define keypoints and parametric correlations in 3D space. We ensure viewpoint consistency by rendering an image of a reference 3D object, then apply style transfer to meet user-defined appearance specifications. At the core of our framework is GeoDrag, which improves the accuracy and speed of drag-based image editing both on geometry guidance tasks and on the general instructions of DragBench. Our results demonstrate that GeoDiffusion enables precise geometric modifications across various iterative design workflows.
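As a rough illustration of how keypoints defined on a 3D prior could be turned into drag instructions in image space, the sketch below projects 3D keypoints through a pinhole camera before and after a parametric edit. It is not the authors' implementation: the function names `project_points` and `drag_pairs`, the x-axis scaling used as the "parametric" edit, and the camera values are all assumptions chosen for the example.

```python
import numpy as np

def project_points(points_3d, K, R, t):
    """Project Nx3 world points to Nx2 pixel coordinates (pinhole camera)."""
    cam = points_3d @ R.T + t        # world -> camera coordinates
    uvw = cam @ K.T                  # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]  # perspective divide

def drag_pairs(keypoints, scale_x, K, R, t):
    """Hypothetical parametric edit: scale the keypoints along x, then return
    (handle, target) pixel positions that a drag-based editor could consume."""
    edited = keypoints * np.array([scale_x, 1.0, 1.0])
    handles = project_points(keypoints, K, R, t)
    targets = project_points(edited, K, R, t)
    return handles, targets

# Assumed camera: 500 px focal length, principal point at (256, 256),
# identity rotation, object placed 5 units in front of the camera.
K = np.array([[500.0, 0.0, 256.0],
              [0.0, 500.0, 256.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])

# Two illustrative keypoints on the left and right of an object.
keypoints = np.array([[-1.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0]])

# Widen the object by 20% and read off where each keypoint should be dragged.
handles, targets = drag_pairs(keypoints, 1.2, K, R, t)
```

Using the same camera for the handle and target projections is what keeps the drag instructions consistent with the rendered viewpoint; only the 3D geometry changes between the two projections.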