FreeArt3D: Training-Free Articulated Object Generation using 3D Diffusion

📅 2025-10-29
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of generating high-fidelity, articulable 3D objects from limited or no labeled data and without task-specific training. Methodologically, it introduces the first extension of Score Distillation Sampling (SDS) to the 3D-to-4D generation domain, leveraging a pre-trained static 3D diffusion model (e.g., Trellis) as a geometric and textural prior to jointly optimize articulation kinematics, geometry, and appearance, treating articulation as an additional generative dimension and enabling training-free synthesis of articulated structures. Key contributions include: (1) the first training-free, fine-tuning-free framework for articulated 3D generation; (2) reconstruction from a few images captured in different articulation states, yielding models with accurate kinematic structures and high-fidelity geometry and texture; and (3) state-of-the-art performance across diverse object categories, demonstrating strong generalization and computational efficiency (generation completes in minutes).

๐Ÿ“ Abstract
Articulated 3D objects are central to many applications in robotics, AR/VR, and animation. Recent approaches to modeling such objects either rely on optimization-based reconstruction pipelines that require dense-view supervision or on feed-forward generative models that produce coarse geometric approximations and often overlook surface texture. In contrast, open-world 3D generation of static objects has achieved remarkable success, especially with the advent of native 3D diffusion models such as Trellis. However, extending these methods to articulated objects by training native 3D diffusion models poses significant challenges. In this work, we present FreeArt3D, a training-free framework for articulated 3D object generation. Instead of training a new model on limited articulated data, FreeArt3D repurposes a pre-trained static 3D diffusion model (e.g., Trellis) as a powerful shape prior. It extends Score Distillation Sampling (SDS) into the 3D-to-4D domain by treating articulation as an additional generative dimension. Given a few images captured in different articulation states, FreeArt3D jointly optimizes the object's geometry, texture, and articulation parameters without requiring task-specific training or access to large-scale articulated datasets. Our method generates high-fidelity geometry and textures, accurately predicts underlying kinematic structures, and generalizes well across diverse object categories. Despite following a per-instance optimization paradigm, FreeArt3D completes in minutes and significantly outperforms prior state-of-the-art approaches in both quality and versatility.
Problem

Research questions and friction points this paper is trying to address.

Generating articulated 3D objects without training diffusion models
Overcoming limited articulated data for high-fidelity geometry and textures
Jointly optimizing geometry, texture, and articulation from few images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free framework for articulated 3D generation
Repurposes static 3D diffusion model as shape prior
Extends Score Distillation Sampling to 3D-to-4D domain
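The core mechanism named above, Score Distillation Sampling, can be illustrated with a toy example: noise the current estimate, ask a pre-trained diffusion prior to predict that noise, and step the parameters along the difference between predicted and injected noise. The sketch below is a hedged 1-D stand-in, not the paper's implementation: a hand-written Gaussian denoiser centred on `TARGET` replaces the real 3D diffusion prior (e.g., Trellis), and all names (`TARGET`, `toy_denoiser`, `sds_step`) are illustrative.

```python
import random

TARGET = 3.0  # the mode the toy "diffusion prior" prefers (stands in for the learned prior)

def toy_denoiser(x_t: float, sigma: float) -> float:
    """Predicted noise for a Gaussian prior centred on TARGET:
    eps_hat = (x_t - TARGET) / sigma."""
    return (x_t - TARGET) / sigma

def sds_step(theta: float, lr: float = 0.05) -> float:
    """One SDS update: noise the current estimate, let the prior predict
    the noise, and step theta along (eps_hat - eps); weighting w(t)=1."""
    sigma = random.uniform(0.2, 1.0)    # random noise level
    eps = random.gauss(0.0, 1.0)        # injected noise
    x_t = theta + sigma * eps           # noised sample
    eps_hat = toy_denoiser(x_t, sigma)  # prior's noise prediction
    grad = eps_hat - eps                # SDS gradient estimate
    return theta - lr * grad

random.seed(0)
theta = -2.0  # initial parameter, far from the prior's mode
for _ in range(2000):
    theta = sds_step(theta)
print(round(theta, 2))  # converges to the prior's mode, 3.0
```

In FreeArt3D the optimized variables are the object's geometry, texture, and articulation parameters rather than a scalar, and the prior's score is evaluated per articulation state, but the update rule has this same distillation form.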
Chuhao Chen
University of California San Diego, USA
Isabella Liu
University of California, San Diego
Computer Vision, Computer Graphics
Xinyue Wei
Hillbot
Computer Graphics, Computer Vision, Embodied AI
Hao Su
University of California San Diego, USA and Hillbot Inc., USA
Minghua Liu
Hillbot
3D Vision, Embodied AI