🤖 AI Summary
To address the scarcity of annotated 3D point clouds, the high cost of acquisition, and copyright restrictions on real-world scans, this paper proposes a self-supervised representation learning paradigm that requires neither manual annotations nor real-world data. Instead, it pretrains models exclusively on procedurally generated, semantics-free 3D shapes constructed from elementary geometric primitives and rigid transformations. The authors provide empirical evidence that purely procedural data suffices to learn geometric representations with strong generalization, matching methods trained on semantically recognizable 3D models (e.g., airplanes, chairs). Their approach integrates procedural modeling, contrastive learning, PointNet++-based point cloud encoders, and masked reconstruction pretraining, and performs on par with state-of-the-art methods on downstream tasks including shape classification, part segmentation, and masked point cloud completion. These findings suggest that current self-supervised 3D learning primarily captures low-level geometric structure rather than high-level semantics.
📝 Abstract
Self-supervised learning has emerged as a promising approach for acquiring transferable 3D representations from unlabeled 3D point clouds. Unlike 2D images, which are widely accessible, acquiring 3D assets requires specialized expertise or professional 3D scanning equipment, making it difficult to scale and raising copyright concerns. To address these challenges, we propose learning 3D representations from procedural 3D programs that automatically generate 3D shapes using simple primitives and augmentations. Remarkably, despite lacking semantic content, the 3D representations learned from this synthesized dataset perform on par with state-of-the-art representations learned from semantically recognizable 3D models (e.g., airplanes) across various downstream 3D tasks, including shape classification, part segmentation, and masked point cloud completion. Our analysis further suggests that current self-supervised learning methods primarily capture geometric structures rather than high-level semantics.
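The paper does not spell out its generator here, but the core idea of a "procedural 3D program" can be illustrated with a minimal sketch: sample a few geometric primitives, scatter points on their surfaces, and compose them with random rigid transformations into a semantics-free point cloud. All function names, primitive choices, and parameters below are illustrative assumptions, not the authors' actual pipeline.

```python
import math
import random

def sample_sphere(n):
    """Uniformly sample n points on the surface of a unit sphere."""
    pts = []
    for _ in range(n):
        z = random.uniform(-1.0, 1.0)                 # uniform in z gives uniform area
        theta = random.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(theta), r * math.sin(theta), z))
    return pts

def sample_box(n):
    """Sample n points on the surface of a unit cube centered at the origin."""
    pts = []
    for _ in range(n):
        axis = random.randrange(3)                    # which pair of faces
        side = random.choice((-0.5, 0.5))             # which face of that pair
        p = [random.uniform(-0.5, 0.5) for _ in range(3)]
        p[axis] = side
        pts.append(tuple(p))
    return pts

def random_rigid_transform(pts):
    """Apply a random rotation about the z-axis and a random translation."""
    a = random.uniform(0.0, 2.0 * math.pi)
    ca, sa = math.cos(a), math.sin(a)
    tx, ty, tz = (random.uniform(-1.0, 1.0) for _ in range(3))
    return [(ca * x - sa * y + tx, sa * x + ca * y + ty, z + tz)
            for (x, y, z) in pts]

def procedural_shape(n_points=1024, n_primitives=4):
    """Compose a semantics-free shape from randomly placed primitives."""
    per = n_points // n_primitives
    cloud = []
    for _ in range(n_primitives):
        prim = random.choice((sample_sphere, sample_box))
        cloud.extend(random_rigid_transform(prim(per)))
    return cloud

cloud = procedural_shape()
print(len(cloud))  # 1024 points, with no recognizable object category
```

Shapes produced this way carry no semantic label by construction, which is what lets the paper attribute any downstream transfer purely to learned geometric structure.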