CoTaP: Compliant Task Pipeline and Reinforcement Learning of Its Controller with Compliance Modulation

📅 2025-09-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the difficulty of achieving compliant physical interaction for humanoid robots in real environments, where most human motion datasets lack measured force data and learning-based control is largely position-based, this paper proposes CoTaP, a two-stage dual-agent reinforcement learning framework combined with model-based compliance control. In the first stage, a base policy with a position-based controller is trained. In the distillation stage, the upper-body policy is combined with model-based compliance control, while the lower-body agent is guided by the base policy. Upper-body task-space compliance is adjustable and can be integrated with other controllers through compliance modulation on the symmetric positive-definite (SPD) manifold, which ensures system stability. The strategy is validated in simulation, primarily by comparing responses to external disturbances under different compliance settings.

📝 Abstract

Whole-body locomotion control is critical for humanoid robots to leverage their inherent advantages. Learning-based control methods derived from retargeted human motion data provide an effective means of addressing this problem. However, because most current human datasets lack measured force data, and learning-based robot control is largely position-based, achieving appropriate compliance during interaction with real environments remains challenging. This paper presents the Compliant Task Pipeline (CoTaP), a pipeline that leverages compliance information in the learning-based structure of humanoid robots. A two-stage dual-agent reinforcement learning framework combined with model-based compliance control for humanoid robots is proposed. During training, a base policy with a position-based controller is trained first; then, during distillation, the upper-body policy is combined with model-based compliance control, and the lower-body agent is guided by the base policy. In the upper-body control, adjustable task-space compliance can be specified and integrated with other controllers through compliance modulation on the symmetric positive-definite (SPD) manifold, ensuring system stability. We validate the feasibility of the proposed strategy in simulation, primarily by comparing the responses to external disturbances under different compliance settings.
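The abstract does not say how compliance is modulated on the SPD manifold. As a minimal sketch of what such an operation can look like, the snippet below blends two task-space stiffness matrices along the affine-invariant geodesic between them, so every intermediate matrix stays symmetric positive-definite. The choice of metric, the function names (`spd_power`, `spd_geodesic`), and the example stiffness values are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def spd_power(S, p):
    """Raise a symmetric positive-definite matrix to a real power p."""
    w, V = np.linalg.eigh(S)          # eigenvalues w > 0 for SPD input
    return (V * w**p) @ V.T


def spd_geodesic(K_a, K_b, t):
    """Point at fraction t in [0, 1] on the affine-invariant geodesic
    from K_a to K_b: K_a^(1/2) (K_a^(-1/2) K_b K_a^(-1/2))^t K_a^(1/2)."""
    a_half = spd_power(K_a, 0.5)
    a_inv_half = spd_power(K_a, -0.5)
    M = a_inv_half @ K_b @ a_inv_half
    M = 0.5 * (M + M.T)               # remove round-off asymmetry
    K = a_half @ spd_power(M, t) @ a_half
    return 0.5 * (K + K.T)


# Hypothetical stiffness settings (N/m) for a 3-D task space.
K_stiff = np.diag([800.0, 800.0, 800.0])
K_soft = np.array([[100.0, 20.0, 0.0],
                   [20.0, 150.0, 0.0],
                   [0.0, 0.0, 50.0]])

# Halfway blend; it is SPD by construction.
K_mid = spd_geodesic(K_stiff, K_soft, 0.5)
```

Unlike entry-wise linear interpolation followed by ad hoc clipping, this interpolation never leaves the SPD set, which is one common motivation for working on the manifold directly.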

Problem

Research questions and friction points this paper is trying to address.

Achieving compliant humanoid locomotion without force data

Integrating compliance modulation into a learning-based control framework

Developing dual-agent reinforcement learning with model-based compliance control

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage dual-agent reinforcement learning for control

Model-based compliance control on SPD manifold

Compliance modulation integrated with position controllers
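To make "task-space compliance" concrete, here is a minimal sketch of a generic task-space impedance law, in which an SPD stiffness matrix and a damping matrix turn pose and velocity errors into a task-space force that is mapped to joint torques through the Jacobian transpose. This is a textbook formulation used for illustration only; the paper's actual controller is not described on this page, and the function names, dimensions, and gains are hypothetical.

```python
import numpy as np


def is_spd(K, tol=1e-9):
    """Return True if K is symmetric and positive-definite."""
    if not np.allclose(K, K.T, atol=tol):
        return False
    try:
        np.linalg.cholesky(K)         # succeeds only for positive-definite K
        return True
    except np.linalg.LinAlgError:
        return False


def compliance_torque(J, K, D, x_des, x, xdot_des, xdot):
    """Joint torques from a task-space spring-damper:
    tau = J^T (K (x_des - x) + D (xdot_des - xdot))."""
    if not (is_spd(K) and is_spd(D)):
        raise ValueError("stiffness and damping must be SPD")
    force = K @ (x_des - x) + D @ (xdot_des - xdot)
    return J.T @ force


# Hypothetical 4-joint arm with a 3-D task space.
J = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
K = np.diag([200.0, 200.0, 50.0])     # softer along the third axis
D = np.diag([20.0, 20.0, 5.0])

tau = compliance_torque(
    J, K, D,
    x_des=np.array([0.1, 0.0, 0.0]), x=np.zeros(3),
    xdot_des=np.zeros(3), xdot=np.zeros(3),
)
```

Lowering the entries of `K` makes the end effector yield more to an external push, which is the kind of setting the simulation comparison varies.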
