Shape from Semantics: 3D Shape Generation from Multi-View Semantics

πŸ“… 2025-02-01
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the challenge of semantic-aware 3D shape generation, where modeling semantic meaning and ensuring multi-view consistency remain difficult. We propose the first end-to-end "semantic → 3D shape" generation framework that takes multi-view semantic text prompts as input and jointly optimizes geometric fidelity, appearance consistency, and fine-grained detail. Key contributions include: (1) multi-semantic Score Distillation Sampling (SDS), enabling 3D implicit field optimization under cross-view semantic constraints; (2) a unified architecture integrating neural Signed Distance Functions (SDFs), 2D diffusion priors, image inpainting, and video generation modules; and (3) a structure-to-detail, multi-stage optimization pipeline. The resulting meshes exhibit high geometric accuracy, coherent texture mapping, smooth view transitions, and strong semantic interpretability. Experiments demonstrate significant improvements in expressiveness and controllability for semantic-driven 3D content creation.

πŸ“ Abstract
We propose "Shape from Semantics", which is able to create 3D models whose geometry and appearance match given semantics when observed from different views. Traditional "Shape from X" tasks usually use visual input (e.g., RGB images or depth maps) to reconstruct geometry, imposing strict constraints that limit creative exploration. Among related applications, works like Shadow Art and Wire Art often make the embedded semantics of a design hard to grasp through direct observation and rely heavily on specific setups for proper display. To address these limitations, our framework uses semantics as input, greatly expanding the design space to create objects that integrate multiple semantic elements and are easily discernible by observers. Considering that this task requires a rich imagination, we adopt various generative models and structure-to-detail pipelines. Specifically, we adopt multi-semantics Score Distillation Sampling (SDS) to distill 3D geometry and appearance from 2D diffusion models, ensuring that the initial shape is consistent with the semantic input. We then use image restoration and video generation models to add more details as supervision. Finally, we introduce a neural signed distance field (SDF) representation to achieve detailed shape reconstruction. Our framework generates meshes with complex details, well-structured geometry, coherent textures, and smooth transitions, resulting in visually appealing and eye-catching designs. Project page: https://shapefromsemantics.github.io
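The multi-semantics SDS idea in the abstract amounts to accumulating a per-view score-distillation gradient on one shared 3D representation, so every semantic target pulls on the same shape. The toy NumPy sketch below caricatures that update; it is not the paper's implementation: the deterministic residual standing in for the diffusion-model score, the weighting `w(t)`, and the flat parameter vector standing in for the implicit field are all assumptions made for illustration.

```python
import numpy as np

def sds_grad(rendered, target, t):
    # Toy stand-in for the SDS gradient w(t) * (eps_pred - eps): with the
    # injected noise cancelled, the residual toward the per-view semantic
    # target plays the role of the diffusion-model score (assumption).
    w = 1.0 - t  # simple timestep weighting (assumption)
    return w * (rendered - target)

def multi_semantic_sds_step(params, render_views, targets, lr=0.1, t=0.5):
    # Accumulate SDS gradients over all semantic views and update the
    # shared 3D parameters (a flat vector standing in for the implicit field).
    grad = np.zeros_like(params)
    for render, target in zip(render_views, targets):
        grad += sds_grad(render(params), target, t)
    return params - lr * grad / len(targets)

if __name__ == "__main__":
    # Two "views" that both observe the parameters directly; the shared
    # shape is driven toward a compromise between both semantic targets.
    identity = lambda p: p
    params = np.zeros(1)
    for _ in range(300):
        params = multi_semantic_sds_step(
            params, [identity, identity], [np.array([1.0]), np.array([3.0])]
        )
    print(params)  # converges toward the mean of the two targets
```

With conflicting targets and identical views, the update settles at their average; in the real pipeline the views differ, so each semantic constraint shapes a different silhouette of the same object.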
Problem

Research questions and friction points this paper is trying to address.

3D modeling
semantic integration
observational conformity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Meaning-to-Shape Generation
Neural Signed Distance Fields (SDF)
Multi-modal Meaning Integration
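The neural SDF listed above maps a 3D point to a signed distance, with surface normals given by the field's gradient. A minimal sketch, using an analytic sphere SDF in place of the network (the network, its training, and the differentiable rendering are all omitted here):

```python
import numpy as np

def sphere_sdf(p, radius=1.0):
    # Signed distance to a sphere: negative inside, zero on the surface,
    # positive outside. A neural SDF would replace this analytic function.
    return np.linalg.norm(p) - radius

def sdf_normal(sdf, p, eps=1e-4):
    # Surface normal as the normalized finite-difference gradient of the field.
    g = np.array([
        sdf(p + eps * e) - sdf(p - eps * e)
        for e in np.eye(3)
    ]) / (2 * eps)
    return g / np.linalg.norm(g)

if __name__ == "__main__":
    print(sphere_sdf(np.array([2.0, 0.0, 0.0])))  # 1.0: one unit outside
    print(sdf_normal(sphere_sdf, np.array([1.0, 0.0, 0.0])))  # ~[1, 0, 0]
```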
πŸ”Ž Similar Papers
No similar papers found.
Liangchen Li
University of Science and Technology of China, China
Caoliwen Wang
University of Science and Technology of China, China
Yuqi Zhou
University of Science and Technology of China, China
Bailin Deng
Senior Lecturer, School of Computer Science and Informatics, Cardiff University
Computer Aided Geometric Design, Discrete Differential Geometry, Architectural Geometry, Digital Fabrication
Juyong Zhang
University of Science and Technology of China
Computer Graphics, 3D Vision, Geometry Processing