🤖 AI Summary
Robots that interact with the physical world must reason about kinematic and dynamic constraints, and current methods do so poorly. To measure this, the authors introduce KinDER, a physical reasoning benchmark for robot learning and planning that isolates physical reasoning from perception, language understanding, and application-specific complexity. The benchmark targets five core challenges (basic spatial relations, nonprehensile multi-object manipulation, tool use, combinatorial geometric constraints, and dynamic constraints) and provides 25 procedurally generated environments, a standardized evaluation suite, and real-to-sim-to-real experiments on a mobile manipulator. It is distributed as a Gymnasium-compatible Python library with parameterized skills and demonstrations, and includes 13 baselines spanning task and motion planning, imitation learning, reinforcement learning, and foundation-model-based approaches. Empirical results show that existing methods struggle to solve many of the environments, indicating substantial gaps in current physical reasoning capabilities. The benchmark is open-sourced to support systematic comparison across paradigms.
📝 Abstract
Robotic systems that interact with the physical world must reason about kinematic and dynamic constraints imposed by their own embodiment, their environment, and the task at hand. We introduce KinDER, a benchmark for Kinematic and Dynamic Embodied Reasoning that targets physical reasoning challenges arising in robot learning and planning. KinDER comprises 25 procedurally generated environments, a Gymnasium-compatible Python library with parameterized skills and demonstrations, and a standardized evaluation suite with 13 implemented baselines spanning task and motion planning, imitation learning, reinforcement learning, and foundation-model-based approaches. The environments are designed to isolate five core physical reasoning challenges: basic spatial relations, nonprehensile multi-object manipulation, tool use, combinatorial geometric constraints, and dynamic constraints, disentangled from perception, language understanding, and application-specific complexity. Empirical evaluation shows that existing methods struggle to solve many of the environments, indicating substantial gaps in current approaches to physical reasoning. We additionally include real-to-sim-to-real experiments on a mobile manipulator to assess the correspondence between simulation and real-world physical interaction. KinDER is fully open-sourced and intended to enable systematic comparison across diverse paradigms for advancing physical reasoning in robotics. Website and code: https://prpl-group.com/kinder-site/
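As an illustration of what "Gymnasium-compatible" and "standardized evaluation" could mean in practice, the sketch below rolls out a policy over a fixed set of seeds and reports a success rate. It is not KinDER code: `ToyReachEnv`, `greedy_policy`, and `success_rate` are hypothetical names invented here, and the toy environment only mimics the Gymnasium calling convention (`reset` returns `(obs, info)`; `step` returns `(obs, reward, terminated, truncated, info)`) using the standard library. Treating `terminated` as success is likewise an assumption of this sketch.

```python
import random


class ToyReachEnv:
    """Hypothetical 1-D stand-in task (not a KinDER environment).

    It follows the Gymnasium reset/step calling convention without
    importing gymnasium, so the sketch runs with the standard library.
    """

    def __init__(self, goal=3, max_steps=20):
        self.goal = goal
        self.max_steps = max_steps
        self.pos = 0
        self.steps = 0

    def reset(self, seed=None, options=None):
        # Seeded start position, so evaluation episodes are reproducible.
        rng = random.Random(seed)
        self.pos = rng.randint(-2, 2)
        self.steps = 0
        return self.pos, {}

    def step(self, action):
        # action is -1, 0, or +1 (move left, stay, move right).
        self.pos += action
        self.steps += 1
        terminated = self.pos == self.goal
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self.pos, reward, terminated, truncated, {}


def success_rate(env, policy, seeds):
    """Fraction of seeded episodes that end by reaching the goal."""
    seeds = list(seeds)
    successes = 0
    for seed in seeds:
        obs, info = env.reset(seed=seed)
        terminated = truncated = False
        while not (terminated or truncated):
            obs, reward, terminated, truncated, info = env.step(policy(obs))
        successes += int(terminated)  # assumption: terminated means solved
    return successes / len(seeds)


def greedy_policy(obs, goal=3):
    # Step one unit toward the goal: +1 if below it, -1 if above, else 0.
    return (goal > obs) - (goal < obs)


rate = success_rate(ToyReachEnv(), greedy_policy, seeds=range(10))
idle_rate = success_rate(ToyReachEnv(), lambda obs: 0, seeds=range(10))
print(rate, idle_rate)
```

Because the loop depends only on the `reset`/`step` interface, the same evaluation function could in principle be pointed at any environment exposing that interface, which is what makes comparing planners, learned policies, and foundation-model agents on shared seeds straightforward.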