🤖 AI Summary
To generate smooth, near-optimal, collision-free 3D Cartesian trajectories from a single artificial demonstration, this paper proposes an obstacle-avoidance method that combines Dynamic Movement Primitives (DMPs) with Policy Improvement with Path Integrals (PI²), a model-free, policy-based reinforcement learning algorithm. The demonstration is encoded as a DMP and serves as the initialization seed; PI² then iteratively reshapes the DMP parameters to build a diverse trajectory dataset across varying obstacle configurations. A neural network trained on this dataset maps task parameters (obstacle dimensions and location, extracted automatically from a point cloud) to the DMP parameters that generate the trajectory. In simulation and real-robot experiments, the approach outperforms an RRT-Connect baseline in computation time, execution time, and trajectory length, and it supports multi-modal trajectory generation for different obstacle geometries and end-effector dimensions. The core contribution is a generalizable 3D motion planner that produces high-quality trajectories quickly while requiring only one demonstration.
📝 Abstract
Learning-based motion planning can quickly generate near-optimal trajectories. However, it often requires either large training datasets or costly collection of human demonstrations. This work proposes an alternative approach that quickly generates smooth, near-optimal, collision-free 3D Cartesian trajectories from a single artificial demonstration. The demonstration is encoded as a Dynamic Movement Primitive (DMP) and iteratively reshaped using policy-based reinforcement learning to create a diverse trajectory dataset for varying obstacle configurations. This dataset is used to train a neural network that takes as inputs the task parameters describing the obstacle dimensions and location, derived automatically from a point cloud, and outputs the DMP parameters that generate the trajectory. The approach is validated in simulation and real-robot experiments, outperforming an RRT-Connect baseline in terms of computation and execution time, as well as trajectory length, while supporting multi-modal trajectory generation for different obstacle geometries and end-effector dimensions. Videos and the implementation code are available at https://github.com/DominikUrbaniak/obst-avoid-dmp-pi2.
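To make the "encode as a DMP, then reshape with policy-based reinforcement learning" step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a standard discrete DMP per Cartesian axis (forcing term left unscaled by the goal distance), a single spherical obstacle, a cost of path length plus a fixed penalty per colliding sample, and a simplified PI²-style update that averages whole-rollout parameter noise by exponentiated cost instead of the full time-indexed PI² rule. All function names, gains, and the obstacle setup are hypothetical, and the neural-network stage that maps obstacle parameters to DMP weights is omitted.

```python
import math
import random

def dmp_rollout(weights, y0, goal, n_steps=200, tau=1.0, alpha_z=25.0, alpha_x=4.0):
    """Euler-integrate a one-dimensional discrete DMP; return the position trace."""
    n_basis = len(weights)
    beta_z = alpha_z / 4.0  # critically damped spring-damper
    # Basis centers evenly spaced in time, expressed in the phase variable x.
    centers = [math.exp(-alpha_x * i / (n_basis - 1)) for i in range(n_basis)]
    widths = [n_basis ** 1.5 / c / alpha_x for c in centers]  # common heuristic
    dt = tau / n_steps
    y, z, x = y0, 0.0, 1.0
    path = [y]
    for _ in range(n_steps):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
        # Assumption: forcing term is not scaled by (goal - y0), so axes with
        # identical start and goal can still be reshaped.
        forcing = x * sum(p * w for p, w in zip(psi, weights)) / (sum(psi) + 1e-10)
        z += dt * (alpha_z * (beta_z * (goal - y) - z) + forcing) / tau
        y += dt * z / tau
        x += dt * (-alpha_x * x) / tau
        path.append(y)
    return path

def rollout_3d(theta, start, goal):
    """theta holds one weight list per axis; returns a list of (x, y, z) points."""
    axes = [dmp_rollout(w, s, g) for w, s, g in zip(theta, start, goal)]
    return list(zip(*axes))

def cost(path, center, radius):
    """Hypothetical cost: path length plus 1.0 per sample inside the sphere."""
    length = sum(math.dist(p, q) for p, q in zip(path, path[1:]))
    hits = sum(1 for p in path if math.dist(p, center) < radius)
    return length + 1.0 * hits

def pi2_probabilities(costs, h=10.0):
    """Softmax over negated, range-normalized costs (lower cost, higher weight)."""
    lo, hi = min(costs), max(costs)
    span = (hi - lo) or 1.0
    expo = [math.exp(-h * (c - lo) / span) for c in costs]
    total = sum(expo)
    return [e / total for e in expo]

def pi2_step(theta, start, goal, obstacle, n_samples=10, sigma=20.0):
    """One simplified PI2-style update: perturb, roll out, average the noise."""
    center, radius = obstacle
    noises, costs = [], []
    for _ in range(n_samples):
        eps = [[random.gauss(0.0, sigma) for _ in axis] for axis in theta]
        trial = [[w + e for w, e in zip(axis, ea)] for axis, ea in zip(theta, eps)]
        noises.append(eps)
        costs.append(cost(rollout_3d(trial, start, goal), center, radius))
    probs = pi2_probabilities(costs)
    return [[w + sum(p * n[a][i] for p, n in zip(probs, noises))
             for i, w in enumerate(axis)]
            for a, axis in enumerate(theta)]

if __name__ == "__main__":
    random.seed(0)
    start, goal = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    obstacle = ((0.5, 0.0, 0.0), 0.15)  # sphere centered on the straight-line path
    theta = [[0.0] * 10 for _ in range(3)]  # zero weights: straight-line seed
    print("initial cost:", cost(rollout_3d(theta, start, goal), *obstacle))
    for _ in range(20):
        theta = pi2_step(theta, start, goal, obstacle)
    print("final cost:", cost(rollout_3d(theta, start, goal), *obstacle))
```

In this sketch the zero-weight DMP plays the role of the single seed demonstration. Repeating the update for different obstacle placements would yield pairs of obstacle parameters and DMP weights, which is the kind of dataset the abstract describes for training the neural network.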