A 3D Generation Framework from Cross Modality to Parameterized Primitive

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address surface irregularity and excessive storage overhead in AI-driven 3D generation, this paper proposes a multi-stage framework that generates 3D models from parametric primitives, guided by text and image inputs. Methodologically, it introduces differentiable parametric primitives, such as spheres, cylinders, and planes, as geometric building blocks, and integrates feature-aware recognition with robust primitive fitting to achieve high geometric fidelity and C¹-smooth surfaces. A minimalist parameter encoding scheme stores only primitive type, pose, and scale, reducing the representation to about 6 KB. Evaluated on both synthetic and real-world datasets, the method achieves state-of-the-art performance (Chamfer Distance = 0.003092, VIoU = 0.545, F1-Score = 0.9139, and Normal Consistency = 0.8369), outperforming leading implicit and mesh-based approaches. This work pioneers the systematic integration of parametric primitive modeling into cross-modal 3D generation, enabling both high-fidelity reconstruction and an ultra-lightweight representation, which makes it well suited to real-time prototyping and edge deployment.
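The storage idea in the summary (keep only primitive type, pose, and scale) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the type codes, the choice of position plus unit quaternion for pose, the float32 layout, and the names `Primitive`, `encode`, and `decode` are hypothetical and are not taken from the paper.

```python
import struct
from dataclasses import dataclass

# Hypothetical type codes; the paper's actual encoding is not specified here.
PRIMITIVE_TYPES = {"sphere": 0, "cylinder": 1, "plane": 2}
TYPE_NAMES = {code: name for name, code in PRIMITIVE_TYPES.items()}

# Assumed record layout: 1 unsigned byte for the type, then 10 float32 values:
# position (3), rotation as a unit quaternion (4), scale (3).
# "<" means little-endian with no padding, so each record is 41 bytes.
RECORD = struct.Struct("<B10f")


@dataclass
class Primitive:
    kind: str
    position: tuple  # (x, y, z)
    rotation: tuple  # unit quaternion (w, x, y, z)
    scale: tuple     # (sx, sy, sz)


def encode(primitives):
    """Pack a list of primitives into a compact byte string."""
    return b"".join(
        RECORD.pack(PRIMITIVE_TYPES[p.kind], *p.position, *p.rotation, *p.scale)
        for p in primitives
    )


def decode(blob):
    """Unpack a byte string produced by encode() back into primitives."""
    return [
        Primitive(TYPE_NAMES[f[0]], f[1:4], f[4:8], f[8:11])
        for f in RECORD.iter_unpack(blob)
    ]
```

Under this assumed layout each primitive costs 41 bytes, so a model made of 100 primitives would occupy 4,100 bytes. That is the same order of magnitude as the reported file size of about 6 KB, although the paper's real format may differ.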

📝 Abstract
Recent advances in AI-driven 3D model generation have leveraged cross-modal inputs, yet generating models with smooth surfaces and minimizing storage overhead remain challenging. This paper introduces a novel multi-stage framework for generating 3D models composed of parameterized primitives, guided by textual and image inputs. Within the framework, a model generation algorithm based on parameterized primitives is proposed; it identifies the shape features of a model's constituent elements and replaces those elements with parameterized primitives that have high-quality surfaces. In addition, a corresponding model storage method is proposed that preserves the original surface quality of the model while retaining only the parameters of the parameterized primitives. Experiments on a virtual-scene dataset and a real-scene dataset demonstrate the effectiveness of our method, which achieves a Chamfer Distance of 0.003092, a VIoU of 0.545, an F1-Score of 0.9139, and an NC of 0.8369, with primitive parameter files approximately 6 KB in size. Our approach is particularly suitable for rapid prototyping of simple models.
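For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute Chamfer Distance and F1-Score between two point sets. It is a brute-force illustration under stated assumptions, not the paper's evaluation code: conventions vary (squared versus unsquared distances, sum versus mean, choice of threshold), and the function names and the threshold parameter `tau` are hypothetical.

```python
import math


def nearest_distances(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def chamfer_distance(a, b):
    """Symmetric Chamfer Distance: mean squared nearest-neighbor distance,
    summed over both directions (one of several conventions in use)."""
    a_to_b = nearest_distances(a, b)
    b_to_a = nearest_distances(b, a)
    return sum(d * d for d in a_to_b) / len(a) + sum(d * d for d in b_to_a) / len(b)


def f1_score(pred, gt, tau):
    """F1 at distance threshold tau: harmonic mean of precision (predicted
    points near the ground truth) and recall (ground-truth points near the
    prediction)."""
    precision = sum(d < tau for d in nearest_distances(pred, gt)) / len(pred)
    recall = sum(d < tau for d in nearest_distances(gt, pred)) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Lower Chamfer Distance and higher F1-Score indicate a closer match between the generated and reference surfaces. The brute-force search is quadratic in the number of points, so a spatial index such as a k-d tree is the usual choice for large point clouds.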
Problem

Research questions and friction points this paper is trying to address.

Generating 3D models with smooth surfaces from cross-modality inputs
Minimizing storage overhead by using parameterized primitive representations
Replacing model elements with parameterized primitives for efficient prototyping
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates 3D models from parameterized primitives, guided by text and image inputs
Replaces model elements with parameterized primitives that have high-quality surfaces
Stores models as small parameter files (about 6 KB) while preserving surface quality
Yiming Liang
Institute of Automation of the Chinese Academy of Sciences (CASIA), M-A-P
LLM
Huan Yu
Zhejiang University, State Key Laboratory of Fluid Power & Mechatronic Systems, Hangzhou, 310058, Zhejiang, China; Zhejiang University, School of Mechanical Engineering, Hangzhou, 310058, Zhejiang, China; Robotics Research Center of Yuyao City, Yuyao, Ningbo, 315400, Zhejiang, China
Zili Wang
StepFun LLM Researcher & M-A-P
Large Language Models, Code Intelligence
Shuyou Zhang
Zhejiang University, State Key Laboratory of Fluid Power & Mechatronic Systems, Hangzhou, 310058, Zhejiang, China; Zhejiang University, School of Mechanical Engineering, Hangzhou, 310058, Zhejiang, China
Guodong Yi
Zhejiang University, State Key Laboratory of Fluid Power & Mechatronic Systems, Hangzhou, 310058, Zhejiang, China; Zhejiang University, School of Mechanical Engineering, Hangzhou, 310058, Zhejiang, China
Jin Wang
Zhejiang University, State Key Laboratory of Fluid Power & Mechatronic Systems, Hangzhou, 310058, Zhejiang, China; Zhejiang University, School of Mechanical Engineering, Hangzhou, 310058, Zhejiang, China; Robotics Research Center of Yuyao City, Yuyao, Ningbo, 315400, Zhejiang, China
Jianrong Tan
Zhejiang University, State Key Laboratory of Fluid Power & Mechatronic Systems, Hangzhou, 310058, Zhejiang, China; Zhejiang University, School of Mechanical Engineering, Hangzhou, 310058, Zhejiang, China