Contact-Safe Reinforcement Learning with ProMP Reparameterization and Energy Awareness

📅 2025-11-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Traditional MDP-based reinforcement learning (RL) for contact-rich robotic manipulation is usually formulated in joint space, where it lacks task-space contact awareness; this compromises safety, trajectory consistency, and task robustness. Method: We propose an energy-aware RL framework that operates directly in task space. It reparameterizes policies using probabilistic movement primitives (ProMPs) and integrates Cartesian impedance control with explicit energy constraints as a safety prior, enabling safe and smooth interaction with complex 3D surfaces. The framework combines proximal policy optimization (PPO) with energy-aware impedance objectives to jointly optimize for contact safety, trajectory continuity, and task success. Results: Experiments on diverse 3D surface manipulation tasks show that the approach outperforms baseline methods, with higher task success rates, smoother trajectories, and more stable physical interactions, while staying within energy-based safety bounds during contact.

📝 Abstract

Reinforcement learning (RL) approaches based on Markov Decision Processes (MDPs) are predominantly applied in the robot joint space, often relying on limited task-specific information and partial awareness of the 3D environment. In contrast, episodic RL has demonstrated advantages over traditional MDP-based methods in terms of trajectory consistency, task awareness, and overall performance in complex robotic tasks. Moreover, both traditional step-wise and episodic RL methods often neglect the contact-rich information inherent in task-space manipulation, particularly with regard to contact safety and robustness. In this work, contact-rich manipulation tasks are tackled using a task-space, energy-safe framework, where reliable and safe task-space trajectories are generated through the combination of Proximal Policy Optimization (PPO) and movement primitives. Furthermore, an energy-aware Cartesian Impedance Controller objective is incorporated within the proposed framework to ensure safe interactions between the robot and the environment. Our experimental results demonstrate that the proposed framework outperforms existing methods in handling tasks on various types of surfaces in 3D environments, achieving high success rates as well as smooth trajectories and energy-safe interactions.

Problem

Research questions and friction points this paper is trying to address.

Addresses contact-rich manipulation tasks in robotics with safety constraints

Overcomes limitations of traditional RL methods in 3D environment awareness

Ensures safe robot-environment interactions through energy-aware impedance control
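To make the idea of energy-aware impedance control more concrete, here is a minimal one-dimensional sketch in Python. It is not the paper's controller: it assumes a generic energy-budget ("tank") scheme in which the spring term of an impedance law is switched off once it would inject more energy than the remaining budget. The function name `energy_limited_step`, the gains, and the time step are all hypothetical choices made for illustration.

```python
def energy_limited_step(x_des, x, v, tank, stiffness=200.0, damping=20.0, dt=0.001):
    """One control step of a 1-D impedance law with an energy budget.

    Returns (force, new_tank). All parameter values are illustrative assumptions.
    """
    spring = stiffness * (x_des - x)   # pulls the end effector toward the target
    damper = -damping * v              # always dissipative
    # Energy the spring term would inject into the robot during this step.
    energy_out = spring * v * dt
    if energy_out > tank:
        # Budget exhausted: drop the spring term, keep only damping.
        spring = 0.0
        energy_out = 0.0
    # Drain the tank only when energy is injected; this sketch never refills it.
    new_tank = tank - max(energy_out, 0.0)
    return spring + damper, new_tank


# Example: start with a 1 J budget and a target 10 cm away.
force, tank = energy_limited_step(x_des=0.1, x=0.0, v=0.5, tank=1.0)
```

With this gating, the total energy the spring term can inject over an episode is bounded by the initial tank value, which is the kind of bound an energy-based safety constraint relies on.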

Innovation

Methods, ideas, or system contributions that make the work stand out.

ProMP reparameterization for task-space trajectories

Energy-aware Cartesian Impedance Controller integration

PPO combined with movement primitives for safety
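The ProMP reparameterization listed above can be illustrated with a short sketch: instead of emitting an action at every time step, an episodic policy outputs one weight vector per episode, and a set of basis functions turns those weights into a smooth trajectory. The code below assumes normalized Gaussian basis functions and a diagonal Gaussian over weights, which is a common ProMP setup but not necessarily the paper's exact choice; the function names, basis count, and width are hypothetical.

```python
import math
import random


def rbf_features(phase, n_basis=5, width=0.02):
    """Normalized Gaussian basis functions evaluated at a phase in [0, 1]."""
    centers = [i / (n_basis - 1) for i in range(n_basis)]
    raw = [math.exp(-((phase - c) ** 2) / (2.0 * width)) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def sample_weights(mean, std, rng):
    """Draw one weight vector from a diagonal Gaussian (the episodic 'action')."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def rollout(weights, n_steps=50):
    """Map a weight vector to a 1-D task-space trajectory y(t) = phi(t)^T w."""
    traj = []
    for k in range(n_steps):
        phase = k / (n_steps - 1)
        phi = rbf_features(phase, n_basis=len(weights))
        traj.append(sum(p * w for p, w in zip(phi, weights)))
    return traj


# Example: sample one set of weights and generate the trajectory for an episode.
rng = random.Random(0)
weights = sample_weights(mean=[0.0, 0.1, 0.3, 0.4, 0.5], std=[0.05] * 5, rng=rng)
trajectory = rollout(weights)
```

Because the whole trajectory is a smooth function of a few weights, exploration noise applied to the weights yields consistent, continuous motions, in contrast to per-step action noise. In a full pipeline an algorithm such as PPO would update the mean and standard deviation of the weight distribution from episode returns.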

🔎 Similar Papers

Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation