PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer

📅 2025-05-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 3D shape primitive abstraction methods suffer from either semantic deficiency or poor generalization: geometric optimization approaches lack semantic understanding, while data-driven methods are constrained by small-scale, single-category training sets. This paper introduces the first end-to-end learning framework tailored to large-scale, human-abstracted assembly data, formulating primitive assembly as a shape-conditioned autoregressive generation task. Key contributions include: (1) an unambiguous, unified parameterization scheme for diverse geometric primitives—including spheres, cylinders, and cuboids—enabling joint representation; and (2) a geometry-aware discretized parameter encoding strategy coupled with a shape-conditioned Transformer architecture. The method produces geometrically faithful, semantically plausible, and perceptually natural primitive assemblies across categories, significantly improving generalization and practical applicability, and enabling downstream uses such as user-generated content (UGC) in games.
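The unified parameterization and discretized encoding described above can be sketched roughly as follows. This is an illustrative Python sketch, not the paper's exact scheme: the `Primitive` fields, parameter ranges, and bin count are assumptions. The idea is that each primitive is described by a type plus continuous translation/rotation/scale parameters, and each continuous parameter is quantized into a discrete token so a Transformer can model the assembly as a token sequence.

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical primitive vocabulary; the paper mentions spheres,
# cylinders, and cuboids as supported primitive types.
PRIMITIVE_TYPES = ["cuboid", "sphere", "cylinder"]

@dataclass
class Primitive:
    ptype: int               # index into PRIMITIVE_TYPES
    translation: np.ndarray  # (3,) center, assumed normalized to [-1, 1]
    rotation: np.ndarray     # (3,) Euler angles in [-pi, pi] (assumption)
    scale: np.ndarray        # (3,) per-axis extents, assumed in [0, 2]

def discretize(value, lo, hi, n_bins=128):
    """Quantize a continuous parameter into one of n_bins token ids."""
    t = (np.clip(value, lo, hi) - lo) / (hi - lo)
    return np.minimum((t * n_bins).astype(int), n_bins - 1)

def primitive_to_tokens(p: Primitive, n_bins=128):
    """Serialize one primitive as a flat token sequence:
    1 type token + 3 translation + 3 rotation + 3 scale = 10 tokens."""
    return ([p.ptype]
            + discretize(p.translation, -1.0, 1.0, n_bins).tolist()
            + discretize(p.rotation, -np.pi, np.pi, n_bins).tolist()
            + discretize(p.scale, 0.0, 2.0, n_bins).tolist())
```

An assembly would then be the concatenation of such per-primitive token groups, which is what makes the auto-regressive formulation possible.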

📝 Abstract
Shape primitive abstraction, which decomposes complex 3D shapes into simple geometric elements, plays a crucial role in human visual cognition and has broad applications in computer vision and graphics. While recent advances in 3D content generation have shown remarkable progress, existing primitive abstraction methods either rely on geometric optimization with limited semantic understanding or learn from small-scale, category-specific datasets, struggling to generalize across diverse shape categories. We present PrimitiveAnything, a novel framework that reformulates shape primitive abstraction as a primitive assembly generation task. PrimitiveAnything includes a shape-conditioned primitive transformer for auto-regressive generation and an ambiguity-free parameterization scheme to represent multiple types of primitives in a unified manner. The proposed framework directly learns the process of primitive assembly from large-scale human-crafted abstractions, enabling it to capture how humans decompose complex shapes into primitive elements. Through extensive experiments, we demonstrate that PrimitiveAnything can generate high-quality primitive assemblies that better align with human perception while maintaining geometric fidelity across diverse shape categories. It benefits various 3D applications and shows potential for enabling primitive-based user-generated content (UGC) in games. Project page: https://primitiveanything.github.io
Problem

Research questions and friction points this paper is trying to address.

Generating human-like 3D primitive assemblies from complex shapes
Overcoming limited generalization in existing primitive abstraction methods
Unifying representation of multiple primitive types for diverse 3D categories
Innovation

Methods, ideas, or system contributions that make the work stand out.

Auto-regressive transformer for primitive assembly generation
Ambiguity-free parameterization for unified primitive representation
Learning from large-scale human-crafted shape abstractions
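The auto-regressive generation idea above can be illustrated with a minimal greedy decoding loop. This is a generic sketch, not the authors' implementation: `model`, the token ids, and the end-of-sequence convention are placeholders. Conditioned on a shape feature, the decoder emits primitive tokens one at a time until it signals completion.

```python
import numpy as np

def generate_assembly(model, shape_cond, max_tokens=512, eos=0):
    """Greedy auto-regressive decoding of a primitive-token sequence.

    `model(shape_cond, prefix)` is any callable returning next-token
    logits given the shape condition and the tokens emitted so far.
    """
    tokens = []
    for _ in range(max_tokens):
        logits = model(shape_cond, tokens)   # (vocab_size,) next-token logits
        next_tok = int(np.argmax(logits))    # greedy pick; sampling also works
        if next_tok == eos:                  # model signals the assembly is complete
            break
        tokens.append(next_tok)
    return tokens
```

In the paper's framework the emitted tokens would be decoded back into primitive types and parameters; here they are left as raw ids to keep the loop self-contained.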
👥 Authors
Jingwen Ye (Assistant Professor, Monash University; Computer Vision)
Yuze He (Tsinghua University and Tencent AIPD, China)
Yanning Zhou (XPENG; Computer Vision, Medical Image Analysis)
Yiqin Zhu (Tencent AIPD, China)
Kaiwen Xiao (Tencent AIPD, China)
Wei Yang (Tencent AIPD, China)
Xiao Han (Tencent AIPD, China)