UniArt: Unified 3D Representation for Generating 3D Articulated Objects with Open-Set Articulation

📅 2025-11-26
🤖 AI Summary
This work addresses end-to-end generation of articulated, animatable 3D objects with open-set articulation from a single input image, a task for which manually built assets are costly and hard to scale. It proposes UniArt, a diffusion-based framework built on a unified latent representation that jointly encodes geometry, texture, part segmentation, and kinematic parameters. A reversible joint-to-voxel embedding spatially aligns articulation features with the volumetric geometry, so that structure and motion are learned together in a shared latent space. Articulation type prediction is formulated as an open-set problem, which allows generalization to novel joint categories and unseen object types. On the PartNet-Mobility benchmark, the method is reported to achieve state-of-the-art mesh quality and articulation accuracy, in contrast to prior multi-stage techniques.

📝 Abstract
Articulated 3D objects play a vital role in realistic simulation and embodied robotics, yet manually constructing such assets remains costly and difficult to scale. In this paper, we present UniArt, a diffusion-based framework that directly synthesizes fully articulated 3D objects from a single image in an end-to-end manner. Unlike prior multi-stage techniques, UniArt establishes a unified latent representation that jointly encodes geometry, texture, part segmentation, and kinematic parameters. We introduce a reversible joint-to-voxel embedding, which spatially aligns articulation features with volumetric geometry, enabling the model to learn coherent motion behaviors alongside structural formation. Furthermore, we formulate articulation type prediction as an open-set problem, removing the need for fixed joint semantics and allowing generalization to novel joint categories and unseen object types. Experiments on the PartNet-Mobility benchmark demonstrate that UniArt achieves state-of-the-art mesh quality and articulation accuracy.
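
The abstract does not spell out how the reversible joint-to-voxel embedding is built. The following is a minimal sketch of one way such an embedding could work, not the authors' method. It assumes a joint is a line given by an `origin` point and an `axis` direction, and that each voxel of the moving part stores the axis direction plus the offset from its center to the closest point on that line. The function names `encode_joint` and `decode_joint` are hypothetical.

```python
import numpy as np

def encode_joint(voxel_centers, origin, axis):
    """Write a joint line (origin, axis) into 6 feature channels per voxel.

    Assumed layout (illustrative only): channels 0-2 hold the unit axis,
    channels 3-5 hold the offset from the voxel center to the closest
    point on the joint line.
    """
    axis = axis / np.linalg.norm(axis)
    rel = origin - voxel_centers                      # (N, 3) voxel -> origin
    # Drop the component along the axis; the remainder is perpendicular
    # and points from the voxel to the nearest point on the line.
    offsets = rel - (rel @ axis)[:, None] * axis      # (N, 3)
    directions = np.broadcast_to(axis, voxel_centers.shape)
    return np.concatenate([directions, offsets], axis=1)

def decode_joint(voxel_centers, features):
    """Recover a point on the joint line and its unit axis from the features."""
    directions, offsets = features[:, :3], features[:, 3:]
    axis = directions.mean(axis=0)
    axis = axis / np.linalg.norm(axis)
    # Every voxel votes for a point on the line; average the votes.
    point = (voxel_centers + offsets).mean(axis=0)
    return point, axis

centers = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [-1.0, 0.5, 2.0]])
origin = np.array([0.5, 0.0, 1.0])
axis = np.array([0.0, 0.0, 2.0])

features = encode_joint(centers, origin, axis)
point, decoded_axis = decode_joint(centers, features)
```

In this sketch the round trip recovers the joint line itself rather than the exact `origin`: the decoded point may slide along the axis, which leaves the motion it describes unchanged. Storing the features per voxel is what keeps them spatially aligned with the geometry grid.
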

Problem

Research questions and friction points this paper is trying to address.

Generating articulated 3D objects directly from a single image
Unifying geometry, texture, part segmentation, and kinematics in one representation
Predicting open-set articulation types for novel joints and unseen objects

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified latent representation for geometry, texture, part segmentation, and kinematic parameters
Reversible joint-to-voxel embedding aligns articulation with volumetric geometry
Open-set articulation type prediction generalizes to novel joint categories
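
The summary does not say how open-set articulation types are predicted. One common way to avoid a fixed label set is to compare a predicted embedding against embeddings of candidate type names, so that new names can be added at inference time. The sketch below illustrates that general idea only; it is not taken from the paper. The function `predict_joint_type`, the toy 3-dimensional vectors in `label_bank`, and the `min_similarity` threshold are all assumptions made for illustration.

```python
import numpy as np

def cosine_similarity(query, candidates):
    """Cosine similarity between one query vector and each candidate row."""
    query = query / np.linalg.norm(query)
    candidates = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return candidates @ query

def predict_joint_type(query, label_bank, min_similarity=0.5):
    """Pick the closest label in the bank, or "unknown" if nothing is close.

    `label_bank` maps a type name to an embedding vector. It can grow at
    inference time, which is what makes the label set open.
    """
    names = list(label_bank)
    scores = cosine_similarity(query, np.stack([label_bank[n] for n in names]))
    best = int(np.argmax(scores))
    if scores[best] < min_similarity:
        return "unknown", float(scores[best])
    return names[best], float(scores[best])

# Toy embeddings (hypothetical); a real system would use learned vectors.
label_bank = {
    "revolute": np.array([1.0, 0.0, 0.0]),
    "prismatic": np.array([0.0, 1.0, 0.0]),
}
first_label, _ = predict_joint_type(np.array([0.9, 0.1, 0.0]), label_bank)

# Adding a new joint category requires no retraining in this sketch.
label_bank["screw"] = np.array([0.0, 0.0, 1.0])
second_label, _ = predict_joint_type(np.array([0.0, 0.1, 0.9]), label_bank)

# A query that matches nothing in the bank is rejected.
third_label, _ = predict_joint_type(np.array([-1.0, -1.0, -1.0]), label_bank)
```

The rejection threshold is a design choice: without it, every query would be forced into one of the known categories, which is the closed-set behavior the open-set formulation is meant to avoid.
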