🤖 AI Summary
This work addresses high-precision contact tasks, where conventional pose control cannot explicitly regulate contact forces under state estimation uncertainty and therefore often causes task failure or component damage. To overcome this limitation, the authors propose a hybrid position-force control strategy in which a reinforcement learning policy dynamically selects the control mode along each Cartesian dimension. Central to this approach is the Mode-Aware Training for Contact Handling (MATCH) algorithm, which explicitly models mode-switching behavior to improve learning efficiency in the larger, more complex action space. Integrated with a low-level impedance control interface, the method achieves up to 10% higher task success rates and five times fewer peg breaks than pose-only policies under common types of state estimation error. In real-robot experiments under high noise, the success rate rises from 33% for pose policies to 68% for MATCH, and the average applied force is approximately 30% lower than with variable impedance policies.
📝 Abstract
Reinforcement learning-based control policies have frequently been shown to be more effective than analytical techniques for many manipulation tasks. Commonly, these methods learn neural control policies that predict end-effector pose changes directly from observed state information. For tasks that impose force constraints, such as inserting delicate connectors, pose-based policies have limited explicit control over force and rely on carefully tuned low-level controllers to avoid executing damaging actions. In this work, we present hybrid position-force control policies that learn to dynamically select when to use force or position control in each control dimension. To improve the learning efficiency of these policies, we introduce Mode-Aware Training for Contact Handling (MATCH), which adjusts policy action probabilities to explicitly mirror the mode selection behavior in hybrid control. We validate the effectiveness of MATCH's learned policies on fragile peg-in-hole tasks under extreme localization uncertainty. We find that MATCH substantially outperforms pose-control policies, solving these tasks with up to 10% higher success rates and 5x fewer peg breaks under common types of state estimation error. MATCH also demonstrates data efficiency equal to that of pose-control policies, despite learning in a larger and more complex action space. In over 1600 sim-to-real experiments on a Franka FR3 in laboratory conditions, we find that MATCH succeeds twice as often as pose policies in high-noise settings (68% vs. 33%) and applies approximately 30% less force on average than variable impedance policies.
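To make the idea of per-dimension mode selection concrete, the following is a minimal sketch of classical hybrid position-force control with a per-axis selection vector. It is not the MATCH algorithm or the authors' controller: the function names `select_modes` and `hybrid_command`, the 0.5 threshold, and the gains `kp` and `kf` are all illustrative assumptions, and only three translational axes are shown.

```python
from typing import List, Sequence


def select_modes(force_probs: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Return True for each axis that should be force-controlled.

    `force_probs` stands in for a policy's per-axis mode output; the
    threshold value is an assumption made for this sketch.
    """
    return [p > threshold for p in force_probs]


def hybrid_command(
    force_axes: Sequence[bool],
    pos_error: Sequence[float],
    force_target: Sequence[float],
    force_measured: Sequence[float],
    kp: float = 100.0,
    kf: float = 0.5,
) -> List[float]:
    """Compute a per-axis force command from the selected control modes.

    Position-controlled axes use a proportional law on the position error.
    Force-controlled axes use the force target as feedforward plus a
    proportional correction on the force error. Gains are hypothetical.
    """
    command = []
    for use_force, e_pos, f_des, f_meas in zip(
        force_axes, pos_error, force_target, force_measured
    ):
        if use_force:
            command.append(f_des + kf * (f_des - f_meas))
        else:
            command.append(kp * e_pos)
    return command


if __name__ == "__main__":
    # Example: keep x and y in position control, regulate force along z.
    modes = select_modes([0.1, 0.2, 0.9])
    cmd = hybrid_command(
        modes,
        pos_error=[0.01, -0.02, 0.5],
        force_target=[0.0, 0.0, -5.0],
        force_measured=[0.0, 0.0, -3.0],
    )
    print(modes, cmd)
```

In this sketch the position error along a force-controlled axis is ignored entirely, which is what lets such a controller cap contact force along the insertion direction even when the estimated hole position is wrong.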