FreeArt3D: Training-Free Articulated Object Generation using 3D Diffusion

📅 2025-10-29
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of generating high-fidelity, articulable 3D objects from limited or no labeled data and without task-specific training. Methodologically, it introduces the first extension of Score Distillation Sampling (SDS) to the 3D-to-4D generation domain, leveraging a pre-trained static 3D diffusion model (e.g., Trellis) as a geometric and textural prior to jointly optimize articulation kinematics, geometry, and appearance, treating articulation as an additional generative dimension and enabling training-free synthesis of articulated structures. Key contributions include: (1) the first training-free, fine-tuning-free framework for articulated 3D generation; (2) reconstruction from a few images captured in different articulation states, yielding models with accurate kinematic structures and high-fidelity geometry and texture; and (3) state-of-the-art performance across diverse object categories, demonstrating strong generalization and computational efficiency (generation completes in minutes).

๐Ÿ“ Abstract
Articulated 3D objects are central to many applications in robotics, AR/VR, and animation. Recent approaches to modeling such objects either rely on optimization-based reconstruction pipelines that require dense-view supervision or on feed-forward generative models that produce coarse geometric approximations and often overlook surface texture. In contrast, open-world 3D generation of static objects has achieved remarkable success, especially with the advent of native 3D diffusion models such as Trellis. However, extending these methods to articulated objects by training native 3D diffusion models poses significant challenges. In this work, we present FreeArt3D, a training-free framework for articulated 3D object generation. Instead of training a new model on limited articulated data, FreeArt3D repurposes a pre-trained static 3D diffusion model (e.g., Trellis) as a powerful shape prior. It extends Score Distillation Sampling (SDS) into the 3D-to-4D domain by treating articulation as an additional generative dimension. Given a few images captured in different articulation states, FreeArt3D jointly optimizes the object's geometry, texture, and articulation parameters without requiring task-specific training or access to large-scale articulated datasets. Our method generates high-fidelity geometry and textures, accurately predicts underlying kinematic structures, and generalizes well across diverse object categories. Despite following a per-instance optimization paradigm, FreeArt3D completes in minutes and significantly outperforms prior state-of-the-art approaches in both quality and versatility.
Problem

Research questions and friction points this paper is trying to address.

Generating articulated 3D objects without training diffusion models
Overcoming limited articulated data for high-fidelity geometry and textures
Jointly optimizing geometry, texture, and articulation from few images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free framework for articulated 3D generation
Repurposes static 3D diffusion model as shape prior
Extends Score Distillation Sampling to 3D-to-4D domain
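The core mechanism named above, Score Distillation Sampling, can be illustrated with a toy example: noise the current estimate, ask a pre-trained diffusion prior to predict that noise, and step the parameters along the difference between predicted and injected noise. The sketch below is a hedged 1-D stand-in, not the paper's implementation: a hand-written Gaussian denoiser centred on `TARGET` replaces the real 3D diffusion prior (e.g., Trellis), and all names (`TARGET`, `toy_denoiser`, `sds_step`) are illustrative.

```python
import random

TARGET = 3.0  # the mode the toy "diffusion prior" prefers (stands in for the learned prior)

def toy_denoiser(x_t: float, sigma: float) -> float:
    """Predicted noise for a Gaussian prior centred on TARGET:
    eps_hat = (x_t - TARGET) / sigma."""
    return (x_t - TARGET) / sigma

def sds_step(theta: float, lr: float = 0.05) -> float:
    """One SDS update: noise the current estimate, let the prior predict
    the noise, and step theta along (eps_hat - eps); weighting w(t)=1."""
    sigma = random.uniform(0.2, 1.0)    # random noise level
    eps = random.gauss(0.0, 1.0)        # injected noise
    x_t = theta + sigma * eps           # noised sample
    eps_hat = toy_denoiser(x_t, sigma)  # prior's noise prediction
    grad = eps_hat - eps                # SDS gradient estimate
    return theta - lr * grad

random.seed(0)
theta = -2.0  # initial parameter, far from the prior's mode
for _ in range(2000):
    theta = sds_step(theta)
print(round(theta, 2))  # converges to the prior's mode, 3.0
```

In FreeArt3D the optimized variables are the object's geometry, texture, and articulation parameters rather than a scalar, and the prior's score is evaluated per articulation state, but the update rule has this same distillation form.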
Chuhao Chen
University of California San Diego, USA
Isabella Liu
University of California, San Diego
Computer Vision, Computer Graphics
Xinyue Wei
Hillbot
Computer Graphics, Computer Vision, Embodied AI
Hao Su
University of California San Diego, USA and Hillbot Inc., USA
Minghua Liu
Hillbot
3D Vision, Embodied AI