🤖 AI Summary
To address action planning and implicit causal reasoning in complex object–object interaction tasks with unknown dynamics, this paper proposes Causal-PIK, an action planning framework driven by causal exploration. The method uses Bayesian optimization with a Physics-Informed Kernel to reason about causal interactions, so that the agent can refine its actions iteratively and search efficiently for the best next action. On the Virtual Tools and PHYRE benchmarks, the approach outperforms state-of-the-art results, requiring fewer actions to reach the goal, and it remains competitive with human problem-solvers on tasks that are very challenging even for them. The human comparison includes results from a new user study the authors conducted on PHYRE.
📝 Abstract
Tasks that involve complex interactions between objects with unknown dynamics make planning before execution difficult. These tasks require agents to iteratively improve their actions after actively exploring causes and effects in the environment. For these types of tasks, we propose Causal-PIK, a method that leverages Bayesian optimization to reason about causal interactions via a Physics-Informed Kernel to help guide efficient search for the best next action. Experimental results on Virtual Tools and PHYRE physical reasoning benchmarks show that Causal-PIK outperforms state-of-the-art results, requiring fewer actions to reach the goal. We also compare Causal-PIK to human studies, including results from a new user study we conducted on the PHYRE benchmark. We find that Causal-PIK remains competitive on tasks that are very challenging, even for human problem-solvers.
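The iterative "try an action, observe the outcome, choose the next action" loop described in the abstract can be illustrated with a generic Bayesian optimization sketch. This is not the authors' implementation: the abstract does not specify the Physics-Informed Kernel, so a standard RBF kernel stands in for it, and the one-dimensional action space, the `toy_score` outcome function, and the upper-confidence-bound acquisition rule are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel (a stand-in; NOT the paper's Physics-Informed Kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean Gaussian process."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(k_tt, k_tq)          # K_tt^{-1} K_tq
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_tq * solved, axis=0)     # RBF prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

def toy_score(action):
    """Hypothetical outcome score (assumption): best action is 0.7."""
    return -(action - 0.7) ** 2

def bayes_opt(score_fn, n_iters=15, beta=2.0):
    """Pick each next action by maximizing an upper confidence bound."""
    candidates = np.linspace(0.0, 1.0, 101)       # discretized 1-D action space
    x = np.array([0.0, 1.0])                      # two initial attempts
    y = score_fn(x)
    for _ in range(n_iters):
        mean, std = gp_posterior(x, y, candidates)
        next_action = candidates[np.argmax(mean + beta * std)]
        x = np.append(x, next_action)             # execute the action ...
        y = np.append(y, score_fn(next_action))   # ... and record its outcome
    return x[np.argmax(y)], x

best_action, tried = bayes_opt(toy_score)
print(f"best action after {len(tried)} attempts: {best_action:.2f}")
```

In this sketch the kernel is the only place where prior knowledge enters the search, which is why replacing a generic kernel with one informed by physics could plausibly reduce the number of attempts needed.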