🤖 AI Summary
Problem: Sim-to-real transfer for dexterous manipulation often fails because low-level controller dynamics differ between simulation and the real robot, producing erroneous contact forces and behavioral deviations.
Method: This paper proposes an end-to-end adaptive framework that jointly learns both the action policy and low-level controller parameters. It explicitly incorporates controller parameters into the observation space and introduces an online parameter adaptation mechanism driven jointly by historical trajectories and controller state, leveraging LSTM or Transformer architectures for temporal modeling.
Contribution/Results: The method reduces the need for extensive manual tuning or excessive controller randomization. Evaluated on a variety of dexterous manipulation tasks involving variable force conditions, it improves sim-to-real transfer performance and narrows the gap in force interaction and motion behavior between simulation and reality.
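To make the interface described above concrete, here is a minimal sketch of how controller parameters might appear in both the observation and the policy output. It is an illustration under stated assumptions, not the paper's implementation: the joint count, the gain range, the choice of a single proportional gain per joint, and all function names (`build_observation`, `split_policy_output`, `stand_in_policy`) are hypothetical, and the stand-in policy is a placeholder where a learned sequence model would go.

```python
import numpy as np

# Hypothetical dimensions and gain limits, chosen only for illustration.
NUM_JOINTS = 16
KP_RANGE = (5.0, 100.0)


def build_observation(joint_pos, joint_vel, kp):
    """Concatenate proprioception with the current controller gains,
    so the policy can see which gains produced the recent behavior."""
    return np.concatenate([joint_pos, joint_vel, kp])


def split_policy_output(raw_output):
    """Split a raw policy output in [-1, 1] into joint targets and new gains.

    The first NUM_JOINTS entries are treated as joint position targets,
    the remaining NUM_JOINTS entries are rescaled linearly into KP_RANGE.
    """
    raw_output = np.clip(raw_output, -1.0, 1.0)
    joint_targets = raw_output[:NUM_JOINTS]
    lo, hi = KP_RANGE
    kp = lo + 0.5 * (raw_output[NUM_JOINTS:] + 1.0) * (hi - lo)
    return joint_targets, kp


def stand_in_policy(obs_history):
    """Placeholder for a learned sequence model over the observation history.
    It ignores its input and returns zeros; a trained network would go here."""
    return np.zeros(2 * NUM_JOINTS)


# One control step: observe (including current gains), act, update gains.
kp = np.full(NUM_JOINTS, 30.0)
history = [build_observation(np.zeros(NUM_JOINTS), np.zeros(NUM_JOINTS), kp)]
joint_targets, kp = split_policy_output(stand_in_policy(history))
```

In this sketch the gains chosen at one step are fed back through the next observation, which is one simple way to let a history-based policy relate its gain choices to the contact behavior it then observes.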
📝 Abstract
Dexterous manipulation has seen remarkable progress in recent years, with policies capable of executing many complex and contact-rich tasks in simulation. However, transferring these policies from simulation to the real world remains a significant challenge. One important issue is the mismatch in low-level controller dynamics, where identical trajectories can lead to vastly different contact forces and behaviors when control parameters vary. Existing approaches often rely on manual tuning or controller randomization, which can be labor-intensive and task-specific and can make training significantly harder. In this work, we propose a framework that jointly learns actions and controller parameters based on the historical information of both trajectory and controller. This adaptive controller adjustment mechanism allows the policy to automatically tune control parameters during execution, thereby mitigating the sim-to-real gap without extensive manual tuning or excessive randomization. Moreover, by explicitly providing controller parameters as part of the observation, our approach facilitates better reasoning over force interactions and improves robustness in real-world scenarios. Experimental results demonstrate that our method achieves improved transfer performance across a variety of dexterous tasks involving variable force conditions.
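The claim that identical trajectories can yield very different contact forces can be illustrated with a small worked example. The sketch below assumes a single joint driven by a proportional-derivative position controller, a rigid object that stops the joint at a fixed angle, and no gravity or friction. These modeling choices, the numeric values, and the function name `steady_state_contact_torque` are assumptions made for illustration and do not come from the paper.

```python
def steady_state_contact_torque(kp, q_target, q_contact):
    """Torque applied by a stalled PD position controller at zero velocity.

    With the joint held at q_contact by a rigid object, the derivative term
    vanishes and the torque is kp * (q_target - q_contact). If the target
    does not reach the object, the joint settles at the target with no
    contact torque.
    """
    penetration = q_target - q_contact
    return kp * max(penetration, 0.0)


# Same commanded target and same object position in both settings.
q_target, q_contact = 0.8, 0.6  # radians (hypothetical values)

# Only the proportional gain differs (hypothetical gains, N*m/rad).
sim_torque = steady_state_contact_torque(20.0, q_target, q_contact)
real_torque = steady_state_contact_torque(60.0, q_target, q_contact)

print(f"sim: {sim_torque:.2f} N*m, real: {real_torque:.2f} N*m")
```

Under these assumptions the commanded trajectory is the same in both cases, yet the contact torque scales directly with the gain (roughly 4 versus 12 N·m here), which is the kind of mismatch a policy with fixed, unobserved gains cannot account for.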