Accelerating Visual Reinforcement Learning with Separate Primitive Policy for Peg-in-Hole Tasks

📅 2025-04-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
For vision-guided peg-in-hole assembly, this paper proposes the Separated Primitive Policy (S2P) framework—the first to decouple binocular visual localization and force-constrained insertion into two jointly optimized, end-to-end learnable policy primitives. The method employs a deep visual encoder, action-space decomposition, a dual-branch policy network, and explicit force-feedback modeling, and is compatible with mainstream model-free reinforcement learning algorithms. Evaluated on ten polygonal peg-in-hole simulation tasks, S2P achieves a 98.7% success rate and improves sample efficiency by 2.3× over baselines. Real-robot experiments demonstrate cross-platform plug-and-play capability. This work significantly enhances policy interpretability, sample efficiency, and deployment robustness, establishing a new paradigm for vision–force fused RL control in precision assembly.

Technology Category

Application Category

📝 Abstract
For peg-in-hole tasks, humans rely on binocular visual perception to locate the peg above the hole surface and then proceed with insertion. This paper draws insights from this behavior to enable agents to learn efficient assembly strategies through visual reinforcement learning. Hence, we propose a Separate Primitive Policy (S2P) to simultaneously learn how to derive location and insertion actions. S2P is compatible with model-free reinforcement learning algorithms. Ten insertion tasks featuring different polygons are developed as benchmarks for evaluations. Simulation experiments show that S2P can boost the sample efficiency and success rate even with force constraints. Real-world experiments are also performed to verify the feasibility of S2P. Ablations are finally given to discuss the generalizability of S2P and some factors that affect its performance.
Problem

Research questions and friction points this paper is trying to address.

Enhancing peg-in-hole task efficiency with visual reinforcement learning
Developing Separate Primitive Policy for location and insertion actions
Improving sample efficiency and success rate under force constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Separate Primitive Policy for dual-action learning
Model-free reinforcement learning compatibility
Boosts efficiency and success with force constraints
🔎 Similar Papers
No similar papers found.
Z
Zichun Xu
State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, Heilongjiang Province, China
Z
Zhaomin Wang
State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, Heilongjiang Province, China
Yuntao Li
Yuntao Li
Peking University
Z
Zhuang Lei
State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, Heilongjiang Province, China
Z
Zhiyuan Zhao
School of Mechanical Engineering, Shandong University, Jinan 250061, China
G
Guocai Yang
State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, Heilongjiang Province, China
J
Jingdong Zhao
State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, Heilongjiang Province, China