MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

📅 2025-08-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 3D procedural reconstruction methods are limited by the weak expressivity of their domain-specific languages and by insufficient training data, which hinders accurate modeling of complex geometry and topology. This paper introduces the first large-scale paired-dataset framework for generating executable Blender Python scripts from point clouds. The authors first design a highly expressive subset of the Blender API and synthesize a million-scale object-code paired dataset. They then propose a multimodal large language model that generates semantically decomposed, editable scripts end to end from raw point clouds. The method significantly outperforms prior approaches on shape-to-code reconstruction and supports fine-grained geometric editing and topological restructuring. It establishes, for the first time, a closed-loop generation pipeline from raw perceptual input to programmable, editable 3D content, advancing procedural 3D understanding and controllable 3D generation.

📝 Abstract
Reconstructing 3D objects into editable programs is pivotal for applications like reverse engineering and shape editing. However, existing methods often rely on limited domain-specific languages (DSLs) and small-scale datasets, restricting their ability to model complex geometries and structures. To address these challenges, we introduce MeshCoder, a novel framework that reconstructs complex 3D objects from point clouds into editable Blender Python scripts. We develop a comprehensive set of expressive Blender Python APIs capable of synthesizing intricate geometries. Leveraging these APIs, we construct a large-scale paired object-code dataset, where the code for each object is decomposed into distinct semantic parts. Subsequently, we train a multimodal large language model (LLM) that translates 3D point clouds into executable Blender Python scripts. Our approach not only achieves superior performance in shape-to-code reconstruction tasks but also facilitates intuitive geometric and topological editing through convenient code modifications. Furthermore, our code-based representation enhances the reasoning capabilities of LLMs in 3D shape understanding tasks. Together, these contributions establish MeshCoder as a powerful and flexible solution for programmatic 3D shape reconstruction and understanding.
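To make the idea of a part-decomposed, editable shape program concrete, here is a minimal sketch in plain Python. It is not MeshCoder's API and does not call Blender: `Box`, `create_box`, `chair_program`, and `sample_point_cloud` are hypothetical names invented for illustration, and every dimension is an arbitrary assumption. The sketch shows a shape written as code with one block per semantic part, an edit made by changing a single parameter, and a point cloud sampled from the result so that the points and the code form one object-code pair. For brevity the points are sampled inside each box volume rather than on its surface.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box primitive (stand-in for a real mesh API call)."""
    name: str
    center: tuple
    size: tuple


def create_box(name, center, size):
    # Assumption: a real pipeline would call a Blender Python API here;
    # this stand-in only records the primitive's parameters.
    return Box(name, tuple(center), tuple(size))


def chair_program(leg_height=0.45, seat_width=0.5):
    """A toy shape program, decomposed into semantic parts (seat, legs)."""
    seat_thickness = 0.05
    leg_thickness = 0.04
    parts = []

    # part: seat
    parts.append(create_box(
        "seat",
        (0.0, 0.0, leg_height + seat_thickness / 2),
        (seat_width, seat_width, seat_thickness),
    ))

    # part: legs (one box per corner of the seat)
    offset = seat_width / 2 - leg_thickness / 2
    index = 0
    for sx in (-1, 1):
        for sy in (-1, 1):
            parts.append(create_box(
                f"leg_{index}",
                (sx * offset, sy * offset, leg_height / 2),
                (leg_thickness, leg_thickness, leg_height),
            ))
            index += 1
    return parts


def sample_point_cloud(parts, n_points, seed=0):
    """Sample points uniformly inside randomly chosen boxes (a simplification)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        box = rng.choice(parts)
        points.append(tuple(
            c + (rng.random() - 0.5) * s for c, s in zip(box.center, box.size)
        ))
    return points


if __name__ == "__main__":
    original = chair_program()
    edited = chair_program(leg_height=0.6)  # geometric edit: taller legs
    cloud = sample_point_cloud(original, 1024)
    print(len(original), "parts,", len(cloud), "points")
    print("seat height before/after edit:", original[0].center[2], edited[0].center[2])
```

In this toy setting, changing `leg_height` moves the seat and resizes all four legs consistently, which is the kind of edit that is awkward on a raw mesh but trivial on a program.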
Problem

Research questions and friction points this paper is trying to address.

Reconstructs 3D objects into editable programs from point clouds
Overcomes limitations of domain-specific languages and small datasets
Generates executable Blender Python scripts for complex geometries
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates Blender Python scripts from point clouds
Uses multimodal LLM for 3D-to-code translation
Creates expressive APIs for complex geometry synthesis
Authors

Bingquan Dai (Tsinghua University; LLM, 3D)
Li Ray Luo (Shanghai Artificial Intelligence Laboratory, Shanghai, China)
Qihong Tang (Shanghai Artificial Intelligence Laboratory, Shanghai, China)
Jie Wang (Shanghai Artificial Intelligence Laboratory, Shanghai, China)
Xinyu Lian (Shanghai Artificial Intelligence Laboratory, Shanghai, China)
Hao Xu (Shanghai Artificial Intelligence Laboratory, Shanghai, China)
Minghan Qin (Bytedance Research | Tsinghua University; Computer Vision, 3D Vision, 3D Scene Perception)
Xudong Xu (Shanghai Artificial Intelligence Laboratory, Shanghai, China)
Bo Dai (Shanghai Artificial Intelligence Laboratory, Shanghai, China)
Haoqian Wang (Tsinghua University, Beijing, China)
Zhaoyang Lyu (PhD of Information Engineering, The Chinese University of Hong Kong; machine learning)
Jiangmiao Pang (Shanghai Artificial Intelligence Laboratory, Shanghai, China)