🤖 AI Summary
Existing methods for converting 2D hand-drawn sketches into 3D models are hindered by either brittle symbolic reasoning or rigid parametric modeling, struggling to balance creative freedom with structural fidelity. This work reframes the task as a conditional dense depth estimation problem and introduces a generative framework based on Latent Diffusion Models (LDMs) with ControlNet-style conditioning. To simulate partial depth cues, the approach incorporates a graph-structured BFS masking strategy, enabling iterative sketch-to-reconstruction interaction. Trained on image-depth pairs derived from the million-scale ABC dataset, the method transcends conventional CAD primitive constraints and demonstrates robust performance on complex shapes, achieving scalable “3D drawing” capability that maps sparse line sketches to dense 3D representations.
📝 Abstract
The conversion of 2D freehand sketches into 3D models remains a pivotal challenge in computer vision, bridging the gap between human creativity and digital fabrication. Traditional line drawing reconstruction relies on brittle symbolic logic, while modern approaches are constrained by rigid parametric modeling, limiting users to predefined CAD primitives. We propose a generative approach by framing reconstruction as a conditional dense depth estimation task. To achieve this, we implement a Latent Diffusion Model (LDM) with a ControlNet-style conditioning framework to resolve the inherent ambiguities of orthographic projections. To support an iterative "sketch-reconstruct-sketch" workflow, we introduce a graph-based BFS masking strategy to simulate partial depth cues. We train and evaluate our approach using a massive dataset of over one million image-depth pairs derived from the ABC Dataset. Our framework demonstrates robust performance across varying shape complexities, providing a scalable pipeline for converting sparse 2D line drawings into dense 3D representations, effectively allowing users to "draw in 3D" without the rigid constraints of traditional CAD.