GeoDiffusion: A Training-Free Framework for Accurate 3D Geometric Conditioning in Image Generation

📅 2025-10-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing image generation methods struggle to control 3D geometry precisely, which limits their use in engineering design and creative production. This paper introduces GeoDiffusion, a training-free framework for geometric conditioning of 3D features in image generation. It uses a class-specific 3D object as a geometric prior to define keypoints and parametric correlations in 3D space, keeps viewpoints consistent by rendering a reference 3D object, and applies style transfer to match user-specified appearance. At its core is GeoDrag, a drag-based editing component that improves both accuracy and speed on geometry guidance tasks and on general instructions from DragBench. The results indicate that GeoDiffusion supports precise geometric modifications across iterative design workflows, narrowing the gap between generative modeling and practical design work.

📝 Abstract

Precise geometric control in image generation is essential for engineering and product design and for the creative industries, where 3D object features must be controlled accurately in image space. Traditional 3D editing approaches are time-consuming and demand specialized skills, while current image-based generative methods lack accuracy in geometric conditioning. To address these challenges, we propose GeoDiffusion, a training-free framework for accurate and efficient geometric conditioning of 3D features in image generation. GeoDiffusion employs a class-specific 3D object as a geometric prior to define keypoints and parametric correlations in 3D space. We ensure viewpoint consistency through a rendered image of a reference 3D object, followed by style transfer to meet user-defined appearance specifications. At the core of our framework is GeoDrag, which improves the accuracy and speed of drag-based image editing on geometry guidance tasks and on general instructions from DragBench. Our results demonstrate that GeoDiffusion enables precise geometric modifications across various iterative design workflows.
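
To make the idea of keypoints on a 3D prior driving a drag-based edit more concrete, here is a minimal sketch of one way 3D keypoints could be turned into handle and target points in image space. It is not the authors' implementation: the pinhole camera, the per-axis `scale` standing in for a parametric change, and the names `project_points` and `drag_pairs` are all assumptions made for illustration.

```python
import numpy as np


def project_points(points_3d, focal, center, cam_z):
    """Pinhole projection of N x 3 points to N x 2 pixel coordinates.

    Assumption: the camera looks along +z and the object is pushed
    cam_z units in front of it. This is a generic camera model, not
    the renderer described in the paper.
    """
    pts = np.asarray(points_3d, dtype=float)
    depth = pts[:, 2] + cam_z
    if np.any(depth <= 0):
        raise ValueError("all points must lie in front of the camera")
    u = focal * pts[:, 0] / depth + center[0]
    v = focal * pts[:, 1] / depth + center[1]
    return np.stack([u, v], axis=1)


def drag_pairs(keypoints, scale, focal=500.0, center=(256.0, 256.0), cam_z=5.0):
    """Return (handles, targets) in pixels for a hypothetical parametric edit.

    The edit here is a per-axis scaling of the 3D keypoints, chosen only
    as a simple stand-in for a parametric change. Handles are the projected
    original keypoints; targets are the projected edited keypoints.
    """
    kp = np.asarray(keypoints, dtype=float)
    edited = kp * np.asarray(scale, dtype=float)
    handles = project_points(kp, focal, center, cam_z)
    targets = project_points(edited, focal, center, cam_z)
    return handles, targets


# Two keypoints on opposite sides of an object, widened by a factor of 2 in x.
handles, targets = drag_pairs([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
                              scale=(2.0, 1.0, 1.0))
print(handles)
print(targets)
```

With the default camera values assumed above, the two handles project to x = 356 and x = 156 at y = 256, and the targets to x = 456 and x = 56, so each handle-target pair describes one drag in pixels. Because both sets of points pass through the same projection, the drags stay consistent with the chosen viewpoint.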

Problem

Research questions and friction points this paper is trying to address.

Achieving precise geometric control in image generation for engineering and design

Addressing accuracy limitations in current geometric conditioning methods for images

Overcoming time-consuming traditional 3D editing requiring specialized skills

Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free framework for 3D geometric conditioning

Uses a class-specific 3D object as a geometric prior to define keypoints and parametric correlations

Ensures viewpoint consistency through rendered reference images

🔎 Similar Papers

LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation