DualTHOR: A Dual-Arm Humanoid Simulation Platform for Contingency-Aware Planning

📅 2025-06-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Embodied agents transfer poorly to complex real-world bimanual coordination and fault-tolerant interaction tasks, largely because existing simulators use oversimplified robot morphologies and neglect the stochasticity of low-level execution. Method: We introduce DualTHOR, the first high-fidelity physics-based simulator tailored for humanoid bimanual robots, built upon AI2-THOR. It integrates realistic robot assets, a bimanual task suite, a humanoid inverse kinematics solver, and a physics-grounded contingency modeling framework with controllable failure injection. DualTHOR supports hybrid Unity/PyBullet simulation and provides a vision-language model (VLM)-driven task planning interface. Contribution/Results: Experiments reveal critical deficiencies in current VLMs for bimanual coordination and fault-tolerant execution. By exposing these failure modes, DualTHOR makes robustness evaluation more faithful to real-world conditions and enables more rigorous validation of real-world transferability.

📝 Abstract
Developing embodied agents capable of performing complex interactive tasks in real-world scenarios remains a fundamental challenge in embodied AI. Although recent advances in simulation platforms have greatly enhanced task diversity to train embodied Vision Language Models (VLMs), most platforms rely on simplified robot morphologies and bypass the stochastic nature of low-level execution, which limits their transferability to real-world robots. To address these issues, we present a physics-based simulation platform DualTHOR for complex dual-arm humanoid robots, built upon an extended version of AI2-THOR. Our simulator includes real-world robot assets, a task suite for dual-arm collaboration, and inverse kinematics solvers for humanoid robots. We also introduce a contingency mechanism that incorporates potential failures through physics-based low-level execution, bridging the gap to real-world scenarios. Our simulator enables a more comprehensive evaluation of the robustness and generalization of VLMs in household environments. Extensive evaluations reveal that current VLMs struggle with dual-arm coordination and exhibit limited robustness in realistic environments with contingencies, highlighting the importance of using our simulator to develop more capable VLMs for embodied tasks. The code is available at https://github.com/ds199895/DualTHOR.git.
Problem

Research questions and friction points this paper is trying to address.

Simulating dual-arm humanoid robots for real-world tasks
Addressing low-level execution stochasticity in robot simulations
Enhancing robustness of Vision Language Models in contingencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-based simulation for dual-arm humanoid robots
Contingency mechanism that introduces realistic failures through physics-based low-level execution
Inverse kinematics solvers for humanoid robot control
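
To make the idea of a contingency mechanism concrete, the following is a minimal sketch of controllable failure injection around a plan executor. It is not DualTHOR's implementation or API: the names `ContingencyExecutor`, `Outcome`, and `run_plan`, the action strings, and the per-action failure probabilities are all hypothetical, and a real simulator would derive failures from physics rather than from a random draw.

```python
import random
from dataclasses import dataclass


@dataclass
class Outcome:
    action: str
    success: bool
    detail: str


class ContingencyExecutor:
    """Hypothetical executor: each action fails with a configurable probability."""

    def __init__(self, failure_probs, seed=None):
        # failure_probs maps an action name to its failure probability in [0, 1].
        self.failure_probs = failure_probs
        self.rng = random.Random(seed)

    def execute(self, action):
        p_fail = self.failure_probs.get(action, 0.0)
        # random() returns a value in [0, 1), so p_fail=1.0 always fails
        # and p_fail=0.0 never fails.
        if self.rng.random() < p_fail:
            return Outcome(action, False, "injected failure")
        return Outcome(action, True, "ok")


def run_plan(executor, plan, max_retries=2):
    """Run actions in order, retrying a failed action up to max_retries times."""
    log = []
    for action in plan:
        for _ in range(max_retries + 1):
            outcome = executor.execute(action)
            log.append(outcome)
            if outcome.success:
                break
        else:
            # Every attempt failed: abort so a planner could replan from here.
            return False, log
    return True, log


if __name__ == "__main__":
    executor = ContingencyExecutor({"pour_water": 0.3}, seed=0)
    done, log = run_plan(executor, ["pick_cup_left", "pick_kettle_right", "pour_water"])
    print(done, [(o.action, o.success) for o in log])
```

Keeping the failure rates in a single dictionary is one way to make contingencies controllable: setting every probability to zero recovers deterministic execution, which gives a baseline against which a planner's robustness can be compared.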

Boyu Li
Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences; Beijing Academy of Artificial Intelligence

Siyuan He
AgiBot

Hang Xu
AgiBot

Haoqi Yuan
Peking University
Machine Learning, Reinforcement Learning, Embodied AI

Yu Zang
AgiBot

Liwei Hu
AgiBot

Junpeng Yue
School of Computer Science, Peking University

Zhenxiong Jiang
AgiBot

Pengbo Hu
University of Science and Technology of China
Embodied Intelligence, Multi-modal Algorithm, Autonomous Agent, Agentic World Building

Borje F. Karlsson
Beijing Academy of Artificial Intelligence

Yehui Tang
Shanghai Jiao Tong University
Machine Learning, Quantum AI & AI4Science

Zongqing Lu
Peking University | BeingBeyond
Reinforcement learning