HAIC: Humanoid Agile Object Interaction Control via Dynamics-Aware World Model

📅 2026-02-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses humanoid robots interacting with underactuated objects that have independent dynamics and non-holonomic constraints in unstructured environments. Such scenarios are complicated by coupling forces, visual occlusions, and the absence of external state estimation. The authors propose HAIC, a unified framework that predicts high-order object states (velocity and acceleration) from proprioceptive history alone and projects them onto static geometric priors to build a spatially grounded dynamic occupancy map. This lets the policy infer collision boundaries and contact affordances even in visual blind spots. The key components are a proprioception-based dynamics predictor, a geometry-guided dynamic occupancy map, and asymmetric fine-tuning, in which the world model continuously adapts to the student policy's exploration to keep state estimation robust under distribution shift. Experiments on a humanoid robot report high success rates in skateboarding, cart pushing and pulling under various loads, and multi-object long-horizon tasks such as carrying a box across varied terrain, with proactive compensation for inertial perturbations.

📝 Abstract

Humanoid robots show promise for complex whole-body tasks in unstructured environments. Although Human-Object Interaction (HOI) has advanced, most methods focus on fully actuated objects rigidly coupled to the robot, ignoring underactuated objects with independent dynamics and non-holonomic constraints. These introduce control challenges from coupling forces and occlusions. We present HAIC, a unified framework for robust interaction across diverse object dynamics without external state estimation. Our key contribution is a dynamics predictor that estimates high-order object states (velocity, acceleration) solely from proprioceptive history. These predictions are projected onto static geometric priors to form a spatially grounded dynamic occupancy map, enabling the policy to infer collision boundaries and contact affordances in blind spots. We use asymmetric fine-tuning, where a world model continuously adapts to the student policy's exploration, ensuring robust state estimation under distribution shifts. Experiments on a humanoid robot show HAIC achieves high success rates in agile tasks (skateboarding, cart pushing/pulling under various loads) by proactively compensating for inertial perturbations, and also masters multi-object long-horizon tasks like carrying a box across varied terrain by predicting the dynamics of multiple objects.
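To make the idea of projecting predicted object states onto a geometric prior more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a box-shaped object footprint as the static prior, a robot-centric 2D grid, and a simple constant-velocity extrapolation standing in for the learned dynamics predictor. The function name `dynamic_occupancy` and all of its parameters are hypothetical.

```python
import math

def dynamic_occupancy(pos, yaw, vel, half_extents, horizon=0.2,
                      grid_size=11, cell=0.1):
    """Rasterize a box-shaped object into a robot-centric 2D grid.

    pos, vel: predicted object position (m) and velocity (m/s), robot frame.
    yaw: predicted object heading (rad), robot frame.
    half_extents: box half-length and half-width (assumed geometric prior).
    horizon: look-ahead time (s) for constant-velocity extrapolation
             (an assumption; the paper uses a learned predictor).
    """
    # Extrapolate the object centre with a constant-velocity model.
    cx = pos[0] + vel[0] * horizon
    cy = pos[1] + vel[1] * horizon
    c, s = math.cos(yaw), math.sin(yaw)
    half = grid_size // 2
    grid = [[0] * grid_size for _ in range(grid_size)]
    for i in range(grid_size):
        for j in range(grid_size):
            # Cell centre in the robot frame (robot sits at the grid centre).
            x = (j - half) * cell
            y = (i - half) * cell
            # Express the cell centre in the object frame (inverse rotation).
            dx, dy = x - cx, y - cy
            lx = c * dx + s * dy
            ly = -s * dx + c * dy
            # Mark the cell occupied if it lies inside the box footprint.
            if abs(lx) <= half_extents[0] and abs(ly) <= half_extents[1]:
                grid[i][j] = 1
    return grid

# Usage: an object at the robot's position moving forward at 1 m/s.
grid = dynamic_occupancy(pos=(0.0, 0.0), yaw=0.0, vel=(1.0, 0.0),
                         half_extents=(0.1, 0.1))
```

In this sketch the occupied cells shift in the direction of motion, which is the property a policy could exploit to anticipate contact without seeing the object. A real system would replace the extrapolation with the predictor's output and likely use richer geometry than a box.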

Problem

Research questions and friction points this paper is trying to address.

Human-Object Interaction

underactuated objects

non-holonomic constraints

dynamics-aware control

occlusions

Innovation

Methods, ideas, or system contributions that make the work stand out.

dynamics-aware world model

proprioceptive state estimation

dynamic occupancy map