🤖 AI Summary
Achieving sub-millimeter (<1 mm) precision insertion under Sim2Real transfer remains challenging due to modeling inaccuracies and sensorimotor discrepancies.
Method: We propose a potential-field-guided residual reinforcement learning (RL) framework. An SE(3)-aware potential-field controller provides stable, interpretable priors, while a residual RL network, trained end-to-end in simulation, refines actions with minimal deviation. We incorporate dual curriculum learning (observation noise and action magnitude), sparse reward training, and vision-based SE(3) target tracking.
Contribution/Results: This work is the first to tightly integrate physically grounded, interpretable potential-field control with data-efficient residual RL, enabling zero-shot, real-world deployment without fine-tuning on physical hardware. Experiments demonstrate superior performance over pure RL and existing hybrid approaches across multiple objects and operating conditions, achieving real-time, robust sub-millimeter insertion accuracy.
📝 Abstract
Object insertion under tight tolerances ($< 1$ mm) is an important but challenging assembly task, as even small errors can result in undesirable contacts. Recent efforts have focused on Reinforcement Learning (RL), which often depends on careful definition of dense reward functions. This work proposes an effective strategy for such tasks that integrates traditional model-based control with RL to achieve improved insertion accuracy. The policy is trained exclusively in simulation and is zero-shot transferred to the real system. It employs a potential field-based controller to acquire a model-based policy for inserting a plug into a socket given full observability in simulation. This policy is then integrated with residual RL, which is trained in simulation given only a sparse, goal-reaching reward. A curriculum scheme over observation noise and action magnitude is used for training the residual RL policy. Both policy components use as input the SE(3) poses of both the plug and the socket and return the plug's SE(3) pose transform, which is executed by a robotic arm using a controller. The integrated policy is deployed on the real system without further training or fine-tuning, given a visual SE(3) object tracker. The proposed solution and alternatives are evaluated across a variety of objects and conditions in simulation and reality. The proposed approach outperforms recent RL-based methods in this domain and prior efforts with hybrid policies. Ablations highlight the impact of each component of the approach.
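To make the composition of the two policy components concrete, the following is a minimal sketch, not the authors' implementation. It assumes a translation-only simplification (the paper's policies operate on full SE(3) poses), a quadratic attractive potential, and a linear curriculum schedule; the function names, the gain, the step limit, and the schedule endpoints are all hypothetical.

```python
import math

def potential_field_action(plug_pos, socket_pos, gain=1.0, max_step=0.005):
    """Step along the negative gradient of U = 0.5 * gain * ||plug - socket||^2.

    Assumption: positions only; rotation is omitted for brevity.
    The step is clipped to max_step (meters) so the base action stays small.
    """
    step = [gain * (s - p) for p, s in zip(plug_pos, socket_pos)]
    norm = math.sqrt(sum(c * c for c in step))
    if norm > max_step:
        step = [c * max_step / norm for c in step]
    return step

def curriculum_scale(progress, start, end):
    """Linearly interpolate a curriculum parameter as progress goes from 0 to 1.

    Assumption: the same schedule shape could drive either the observation
    noise level or the residual action magnitude; the source does not state
    the schedule's form or direction.
    """
    t = min(max(progress, 0.0), 1.0)
    return start + t * (end - start)

def combined_action(plug_pos, socket_pos, residual, residual_scale):
    """Base potential-field action plus a scaled residual correction."""
    base = potential_field_action(plug_pos, socket_pos)
    return [b + residual_scale * r for b, r in zip(base, residual)]
```

In this sketch the `residual` argument stands in for the output of the learned residual policy, and `residual_scale` would be set by `curriculum_scale` during training so that the learned correction is bounded relative to the model-based prior.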