🤖 AI Summary
Trajectory optimization for redundant manipulators depends heavily on the quality of the initial trajectory, yet finding a good one is slow because the space of solution trajectories is very large and little is known in advance about task constraints in configuration space. This work proposes a learning-based initial trajectory generator built on example-guided reinforcement learning, together with a null-space projected imitation reward that lets the policy learn kinematically feasible motion from expert demonstrations while accounting for null-space constraints. In simulation, seeding trajectory optimization with the generated trajectories improves optimality, efficiency, and applicability relative to three baselines, and real-world experiments with a seven-degree-of-freedom manipulator confirm the improvement and feasibility.
📝 Abstract
Trajectory optimization (TO) is an efficient tool for generating a redundant manipulator's joint trajectory that follows a 6-dimensional Cartesian path. Its performance depends largely on the quality of the initial trajectory. However, selecting a high-quality initial trajectory is non-trivial and requires a considerable time budget because the space of solution trajectories is extremely large and prior knowledge about task constraints in configuration space is lacking. To alleviate this issue, we present a learning-based method that generates high-quality initial trajectories within a short time budget by adopting example-guided reinforcement learning. In addition, we propose a null-space projected imitation reward that accounts for null-space constraints by efficiently learning the kinematically feasible motion captured in expert demonstrations. Our statistical evaluation in simulation shows improved optimality, efficiency, and applicability of TO when it is initialized with our method's output, compared with three other baselines. We also show the performance improvement and feasibility via real-world experiments with a seven-degree-of-freedom manipulator.
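
The abstract does not give the form of the null-space projected imitation reward, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes the standard null-space projector N = I - J⁺J for a task Jacobian J and rewards agreement between the policy's and the expert's joint velocities after projection into that null space. The function names, the exponential shaping, and the `scale` parameter are all hypothetical.

```python
import numpy as np


def null_space_projector(J):
    """Return N = I - pinv(J) @ J for a task Jacobian J of shape (m, n).

    N maps any joint velocity to its component that produces no
    end-effector motion (the null space of J).
    """
    n = J.shape[1]
    return np.eye(n) - np.linalg.pinv(J) @ J


def null_space_imitation_reward(J, q_dot, q_dot_expert, scale=1.0):
    """Hypothetical imitation reward in (0, 1].

    Compares policy and expert joint velocities only in the null space
    of J, so differences that change the Cartesian motion are ignored.
    The exponential form and `scale` are assumptions for illustration.
    """
    N = null_space_projector(J)
    diff = N @ (q_dot - q_dot_expert)
    return float(np.exp(-scale * (diff @ diff)))
```

With a 6 x 7 Jacobian, as for a seven-degree-of-freedom arm following a 6-dimensional Cartesian path, the null space is one-dimensional when J has full row rank, so under this reading the reward only scores how closely the policy matches the expert's self-motion.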