AI Summary
In human–robot collaboration (HRC), safety mechanisms often compromise task efficiency, leading to frequent human interventions, robot fallback motions, task failures, and repeated motion replanning. To address this, we propose a two-tiered collaborative framework integrating reinforcement learning (RL) and interactive motion planning: an RL-based task-level policy adaptively selects high-level actions, while a motion-level planner generates safe, dynamically feasible trajectories conditioned on real-time human pose estimation; bidirectional feedback between the two layers jointly optimizes safety and efficiency. Evaluated in both simulation and on a physical collaborative robot platform, our framework significantly reduces target instruction repetition (by 42%) and replanning frequency (by 58%) compared to conventional hard-coded approaches, while maintaining collision avoidance and high task success rates. This enables a principled trade-off between safety and efficiency in dynamic, unstructured environments.
Abstract
In a Human-Robot Cooperation (HRC) environment, safety and efficiency are the two core properties by which robot performance is evaluated. However, safety mechanisms usually hinder task efficiency: human intervention causes backup motions and goal failures for the robot, and frequent motion replanning increases both the computational load and the chance of failure. In this paper, we present a hybrid Reinforcement Learning (RL) planning framework comprising an interactive motion planner and an RL task planner. The RL task planner chooses statistically safe and efficient task sequences based on feedback from the motion planner, while the motion planner keeps task execution collision-free by detecting human arm motions and deploying new paths when the previous path is no longer valid. Intuitively, the RL agent learns to avoid dangerous tasks, while the motion planner ensures that the chosen tasks are executed safely. The proposed framework is validated on a collaborative robot in both simulation and the real world, where we compare it against hard-coded task and motion planning methods. The results show that our planning framework can 1) react to uncertain human motions at both the joint and task levels; 2) reduce the number of times failed goal commands are repeated; 3) reduce the total number of replanning requests.
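The two-tier interaction described above can be sketched in a few lines of Python. This is a minimal toy illustration, not the paper's actual method: the `MotionPlanner` stub, the tabular task-level learner, the reward values, and the one-dimensional "human position" are all assumptions made for clarity. The point it demonstrates is the feedback loop, where the motion planner's rejections act as negative reward so the task-level policy statistically learns to avoid tasks the human frequently occupies.

```python
import random

class MotionPlanner:
    """Stub motion-level planner: reports whether a collision-free path
    to the chosen task exists given the (simulated) human arm position."""
    def plan(self, task, human_pos):
        # A task is "blocked" when the human occupies its workspace cell,
        # which would trigger a replanning request in the real system.
        return task != human_pos

class TaskPlanner:
    """Toy task-level policy: a tabular value estimate per task,
    updated from the motion planner's success/failure feedback."""
    def __init__(self, tasks, lr=0.5):
        self.q = {t: 0.0 for t in tasks}
        self.lr = lr

    def select(self, eps=0.1):
        # Epsilon-greedy over task values.
        if random.random() < eps:
            return random.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def update(self, task, reward):
        # Incremental running-average update toward the observed reward.
        self.q[task] += self.lr * (reward - self.q[task])

def run(episodes=500, seed=0):
    random.seed(seed)
    tasks = [0, 1, 2]
    rl, mp = TaskPlanner(tasks), MotionPlanner()
    replans = 0
    for _ in range(episodes):
        # Simulated human arm: mostly near task 0, sometimes task 1.
        human_pos = random.choice([0, 0, 0, 1])
        task = rl.select()
        if mp.plan(task, human_pos):
            rl.update(task, +1.0)   # executed safely and efficiently
        else:
            replans += 1            # motion planner rejects the path
            rl.update(task, -1.0)   # feedback discourages risky tasks
    return rl.q, replans
```

Running `run()` shows the intended behavior: the value of task 2 (never blocked) ends up well above that of task 0 (usually blocked), so the policy stops repeating goal commands that the motion planner keeps rejecting.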