Exploring Pose-Guided Imitation Learning for Robotic Precise Insertion

📅 2025-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing imitation learning methods for high-precision robotic insertion suffer from low accuracy, reliance on redundant image/point-cloud observations, and poor sample efficiency. To address these challenges, this paper proposes the first SE(3)-relative-pose-guided imitation learning framework. Our method unifies observation and action representation via SE(3) relative pose; introduces a target-conditioned RGB-D encoder coupled with a pose-guided residual gated fusion mechanism to adaptively integrate geometric priors and visual details; and jointly models policies via diffusion-based trajectory generation and SE(3) pose trajectory prediction. Evaluated on six fine-grained insertion tasks, our approach achieves ≈0.01 mm positioning accuracy using only 7–10 demonstrations—substantially outperforming state-of-the-art methods—while demonstrating strong generalization and unprecedented sample efficiency.

Technology Category

Application Category

📝 Abstract

Recent studies have proved that imitation learning shows strong potential in the field of robotic manipulation. However, existing methods still struggle with precision manipulation task and rely on inefficient image/point cloud observations. In this paper, we explore to introduce SE(3) object pose into imitation learning and propose the pose-guided efficient imitation learning methods for robotic precise insertion task. First, we propose a precise insertion diffusion policy which utilizes the relative SE(3) pose as the observation-action pair. The policy models the source object SE(3) pose trajectory relative to the target object. Second, we explore to introduce the RGBD data to the pose-guided diffusion policy. Specifically, we design a goal-conditioned RGBD encoder to capture the discrepancy between the current state and the goal state. In addition, a pose-guided residual gated fusion method is proposed, which takes pose features as the backbone, and the RGBD features selectively compensate for pose feature deficiencies through an adaptive gating mechanism. Our methods are evaluated on 6 robotic precise insertion tasks, demonstrating competitive performance with only 7-10 demonstrations. Experiments demonstrate that the proposed methods can successfully complete precision insertion tasks with a clearance of about 0.01 mm. Experimental results highlight its superior efficiency and generalization capability compared to existing baselines. Code will be available at https://github.com/sunhan1997/PoseInsert.

Problem

Research questions and friction points this paper is trying to address.

Improving robotic precision insertion using SE(3) pose-guided imitation learning

Enhancing efficiency by integrating RGBD data with pose-guided diffusion policy

Achieving high accuracy (0.01mm clearance) with minimal demonstrations (7-10)

Innovation

Methods, ideas, or system contributions that make the work stand out.

SE(3) pose-guided imitation learning for precision

Goal-conditioned RGBD encoder enhances state discrepancy

Pose-guided residual gated fusion adaptively compensates features

🔎 Similar Papers

Gaze-Based Dual Resolution Deep Imitation Learning for High-Precision Dexterous Robot Manipulation

2021-02-02IEEE Robotics and Automation LettersCitations: 24

Authors to Follow