Particulate: Feed-Forward 3D Object Articulation

📅 2025-12-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing methods lack an end-to-end solution for inferring complete articulation structures—including 3D movable parts, kinematic topology, and motion constraints—directly from a single static 3D mesh. Method: We propose the first end-to-end feed-forward framework, Part Articulation Transformer (PAT), which operates directly on point cloud representations of static meshes and is trained end-to-end on a large-scale articulated 3D dataset; PAT natively supports multi-joint modeling and is the first method adapted to AI-generated 3D assets. Contribution/Results: We introduce a human-centered benchmark and evaluation protocol tailored to articulated structure inference. Experiments demonstrate that PAT significantly outperforms prior work in accuracy, generalization across diverse object categories, and inference speed (sub-second latency), enabling a fully automatic pipeline: “single image → 3D generation → articulation structure extraction.”

Technology Category

Application Category

📝 Abstract
We present Particulate, a feed-forward approach that, given a single static 3D mesh of an everyday object, directly infers all attributes of the underlying articulated structure, including its 3D parts, kinematic structure, and motion constraints. At its core is a transformer network, Part Articulation Transformer, which processes a point cloud of the input mesh using a flexible and scalable architecture to predict all the aforementioned attributes with native multi-joint support. We train the network end-to-end on a diverse collection of articulated 3D assets from public datasets. During inference, Particulate lifts the network's feed-forward prediction to the input mesh, yielding a fully articulated 3D model in seconds, much faster than prior approaches that require per-object optimization. Particulate can also accurately infer the articulated structure of AI-generated 3D assets, enabling full-fledged extraction of articulated 3D objects from a single (real or synthetic) image when combined with an off-the-shelf image-to-3D generator. We further introduce a new challenging benchmark for 3D articulation estimation curated from high-quality public 3D assets, and redesign the evaluation protocol to be more consistent with human preferences. Quantitative and qualitative results show that Particulate significantly outperforms state-of-the-art approaches.
Problem

Research questions and friction points this paper is trying to address.

Infer articulated structure from single static 3D mesh
Predict 3D parts, kinematic structure, motion constraints
Enable fast feed-forward articulation without per-object optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feed-forward transformer network for 3D articulation
Direct inference of parts, kinematics, and constraints
Fast articulated model generation from single mesh
🔎 Similar Papers
No similar papers found.