🤖 AI Summary
To address the high cost, labor-intensive modeling, and poor scalability of articulated asset creation in robotic simulation environments, this paper proposes ArtiWorld, a scene-aware pipeline whose core component, Arti4URDF, enables end-to-end automatic generation of URDF models directly executable in simulation from static 3D scenes (e.g., point clouds). Arti4URDF integrates point-cloud geometric analysis, large language models (LLMs) encoding physical commonsense priors, and URDF-structured prompt engineering to support text-guided identification of movable parts, kinematic chain inference, and geometry-preserving joint constraint modeling. Evaluated on both synthetic and real-world scanned scenes, it achieves state-of-the-art performance, producing URDF models with high geometric fidelity and out-of-the-box simulation compatibility. The approach significantly lowers the barrier to constructing interactive robotic simulation environments.
📝 Abstract
Building interactive simulators and scalable robot-learning environments requires a large number of articulated assets. However, most existing 3D assets in simulation are rigid, and manually converting them into articulated objects is extremely labor- and cost-intensive. This raises a natural question: can we automatically identify articulable objects in a scene and convert them into articulated assets directly? In this paper, we present ArtiWorld, a scene-aware pipeline that localizes candidate articulable objects from textual scene descriptions and reconstructs executable URDF models that preserve the original geometry. At the core of this pipeline is Arti4URDF, which leverages 3D point clouds, the prior knowledge of a large language model (LLM), and a URDF-oriented prompt design to rapidly convert rigid objects into interactive URDF-based articulated objects while maintaining their 3D shape. We evaluate ArtiWorld at three levels: 3D simulated objects, full 3D simulated scenes, and real-world scanned scenes. Across all three settings, our method consistently outperforms existing approaches and achieves state-of-the-art performance, while preserving object geometry and correctly capturing object interactivity to produce usable URDF-based articulated models. This provides a practical path toward building interactive, robot-ready simulation environments directly from existing 3D assets. Code and data will be released.
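To make the target representation concrete, the sketch below shows the kind of URDF an approach like Arti4URDF would emit for a rigid cabinet scan split into a body and a door connected by an inferred revolute joint. All names, mesh filenames, and joint limits here are illustrative assumptions, not output from the paper's pipeline.

```xml
<?xml version="1.0"?>
<!-- Illustrative URDF for an articulated cabinet: a fixed body link
     and a door link attached via a revolute (hinge) joint.
     Mesh files and joint parameters are hypothetical. -->
<robot name="cabinet">
  <link name="body">
    <visual>
      <geometry><mesh filename="cabinet_body.obj"/></geometry>
    </visual>
    <collision>
      <geometry><mesh filename="cabinet_body.obj"/></geometry>
    </collision>
  </link>
  <link name="door">
    <visual>
      <geometry><mesh filename="cabinet_door.obj"/></geometry>
    </visual>
    <collision>
      <geometry><mesh filename="cabinet_door.obj"/></geometry>
    </collision>
  </link>
  <!-- Hinge along the door's vertical edge; the door swings
       from closed (0 rad) to roughly 90 degrees (1.57 rad). -->
  <joint name="door_hinge" type="revolute">
    <parent link="body"/>
    <child link="door"/>
    <origin xyz="0.3 0.0 0.0" rpy="0 0 0"/>
    <axis xyz="0 0 1"/>
    <limit lower="0.0" upper="1.57" effort="10.0" velocity="1.0"/>
  </joint>
</robot>
```

A file in this form can be loaded directly by common simulators (e.g., via PyBullet's `loadURDF`), which is what "directly executable in simulation" refers to.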