ArtVIP: Articulated Digital Assets of Visual Realism, Modular Interaction, and Physical Fidelity for Robot Learning

📅 2025-06-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing open-source articulated-object datasets suffer from significant deficiencies in visual photorealism and physical fidelity, limiting their practical utility for robot learning. To address this, the paper introduces ArtVIP, a high-fidelity digital-twin dataset of articulated objects designed for robot learning, covering representative indoor scenes while jointly ensuring visual realism, physical accuracy, and modular interactivity. The approach embeds modular interaction behaviors and pixel-level affordance annotations directly in the assets, leverages USD for unified asset encapsulation, and integrates optical motion-capture validation, PBR-based high-resolution texturing, and fine-grained calibration of rigid-body dynamics parameters. Experiments demonstrate substantial improvements in Sim2Real transfer, with consistent gains across both imitation-learning and reinforcement-learning benchmarks. The complete dataset, including all assets, annotations, and the full production pipeline, is publicly released under an open-source license.

📝 Abstract
Robot learning increasingly relies on simulation to advance complex abilities such as dexterous manipulation and precise interaction, necessitating high-quality digital assets to bridge the sim-to-real gap. However, existing open-source articulated-object datasets for simulation are limited by insufficient visual realism and low physical fidelity, which hinder their utility for training models to master robotic tasks in the real world. To address these challenges, we introduce ArtVIP, a comprehensive open-source dataset comprising high-quality digital-twin articulated objects, accompanied by indoor-scene assets. Crafted by professional 3D modelers adhering to unified standards, ArtVIP ensures visual realism through precise geometric meshes and high-resolution textures, while physical fidelity is achieved via fine-tuned dynamic parameters. Meanwhile, the dataset pioneers embedded modular interaction behaviors within assets and pixel-level affordance annotations. Feature-map visualization and optical motion capture are employed to quantitatively demonstrate ArtVIP's visual and physical fidelity, with its applicability validated across imitation-learning and reinforcement-learning experiments. Provided in USD format with detailed production guidelines, ArtVIP is fully open-source, benefiting the research community and advancing robot learning research. Our project is at https://x-humanoid-artvip.github.io/ .
Problem

Research questions and friction points this paper is trying to address.

Lack of visually realistic and physically accurate digital assets for robot learning
Limited open-source datasets for articulated-object simulation in robotics
Need for modular interaction behaviors and affordance annotations in simulations
Innovation

Methods, ideas, or system contributions that make the work stand out.

High-quality digital-twin articulated objects
Embedded modular interaction behaviors
Pixel-level affordance annotations
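To make the affordance idea concrete, here is a minimal sketch of how a downstream policy might consume a pixel-level affordance annotation: given a binary mask marking an object's functional region (e.g. a handle), sample a pixel from that region as an interaction target. The mask layout and the `sample_affordance_point` helper are illustrative assumptions, not ArtVIP's actual API.

```python
import numpy as np

def sample_affordance_point(mask, rng=None):
    """Sample one pixel (row, col) from a binary affordance mask (1 = functional region).

    Returns None when the mask contains no functional pixels.
    Note: this helper is a hypothetical sketch, not part of the ArtVIP toolkit.
    """
    rng = np.random.default_rng(rng)
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return None
    i = rng.integers(len(ys))
    return int(ys[i]), int(xs[i])

# Toy example: a 4x4 mask where only a small "handle" region is functional.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 2] = 1  # functional pixels at (1, 2) and (2, 2)
point = sample_affordance_point(mask, rng=0)
```

In practice such a sampled pixel would be back-projected through the camera model into a 3D interaction target for the manipulator.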
Authors
Zhao Jin — Beijing Innovation Center of Humanoid Robotics
Zhengping Che — X-Humanoid — Embodied AI, Deep Learning
Zhen Zhao — Beijing Innovation Center of Humanoid Robotics
Kun Wu — Beijing Innovation Center of Humanoid Robotics
Yuheng Zhang — University of Illinois Urbana-Champaign — Machine Learning, Reinforcement Learning, Online Learning, Bandits, Learning Theory
Yinuo Zhao — PhD, Beijing Institute of Technology — Deep Reinforcement Learning, Mobile Crowdsensing, Robot Learning
Zehui Liu — Beijing Innovation Center of Humanoid Robotics
Qiang Zhang — Beijing Innovation Center of Humanoid Robotics
Xiaozhu Ju — Beijing Innovation Center of Humanoid Robotics
Jing Tian — National University of Singapore — Video Analytics, Computer Vision, Industrial AI
Yousong Xue — Beijing Institute of Architectural Design
Jian Tang — Beijing Innovation Center of Humanoid Robotics