Stochastic Trajectory Optimization for Robotic Skill Acquisition From a Suboptimal Demonstration

📅 2024-08-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address low imitation efficiency in robot learning from suboptimal human demonstrations, which are characterized by poor dynamic performance (e.g., low velocity, high jitter), this paper proposes MSTOMP, a multi-policy version of Stochastic Trajectory Optimization for Motion Planning (STOMP). Its core contributions are: (1) frequency-domain gain control to denoise the demonstration, paired with a new metric, Mean Square Error in the Spectrum (MSES), that measures trajectory differences in the frequency domain at lower computational cost; (2) a cost function that combines a DTW-based shape difference with performance terms such as collision cost, together with a theoretical account of how the time-domain and frequency-domain methods are connected; and (3) multi-policy optimization that is more stable and more robust to parameter changes. Evaluated in both simulation and real-world robotic arm experiments, MSTOMP achieves faster convergence (2.3× on average), improved optimization stability, and better dynamic execution quality than existing Learning-from-Demonstration (LfD) trajectory optimization methods.

📝 Abstract

Learning from Demonstration (LfD) has emerged as a crucial method for robots to acquire new skills. However, when given suboptimal task trajectory demonstrations with shape characteristics reflecting human preferences but subpar dynamic attributes such as slow motion, robots not only need to mimic the behaviors but also optimize the dynamic performance. In this work, we leverage optimization-based methods to search for a superior-performing trajectory whose shape is similar to that of the demonstrated trajectory. Specifically, we use Dynamic Time Warping (DTW) to quantify the difference between two trajectories and combine it with additional performance metrics, such as collision cost, to construct the cost function. Moreover, we develop a multi-policy version of the Stochastic Trajectory Optimization for Motion Planning (STOMP), called MSTOMP, which is more stable and robust to parameter changes. To deal with the jitter in the demonstrated trajectory, we further utilize the gain-controlling method in the frequency domain to denoise the demonstration and propose a computationally more efficient metric, called Mean Square Error in the Spectrum (MSES), that measures the trajectories' differences in the frequency domain. We also theoretically highlight the connections between the time domain and the frequency domain methods. Finally, we verify our method in both simulation experiments and real-world experiments, showcasing its improved optimization performance and stability compared to existing methods. The source code can be found at https://ming-bot.github.io/MSTOMP.github.io.
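The cost construction described in the abstract (a DTW shape term plus performance terms such as collision cost) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `dtw_distance` and `trajectory_cost`, the weights `w_shape` and `w_collision`, the Euclidean point distance, and the linear weighting are all assumptions made for the example.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(N*M) dynamic time warping distance between two trajectories.

    a, b: arrays of shape (N,) or (N, D); Euclidean distance between points
    is an assumption of this sketch.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    # acc[i, j] = cheapest cost of aligning a[:i] with b[:j]
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            acc[i, j] = d + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

def trajectory_cost(candidate, demo, collision_cost, w_shape=1.0, w_collision=1.0):
    """Hypothetical combined cost: shape difference to the demo plus a collision term.

    collision_cost is a user-supplied callable; the weighted sum is an assumption.
    """
    return w_shape * dtw_distance(candidate, demo) + w_collision * collision_cost(candidate)
```

Because DTW aligns points regardless of timing, a slow demonstration and a faster trajectory along the same path have zero shape cost, for example `dtw_distance([0, 1, 2], [0, 0, 1, 1, 2, 2])`. This is the property that lets an optimizer speed up the motion without being penalized for changing the shape.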

Problem

Research questions and friction points this paper is trying to address.

Optimizing suboptimal robotic trajectories from human demonstrations

Enhancing dynamic performance while preserving trajectory shape

Denoising and efficiently measuring trajectory differences in frequency domain
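As a rough illustration of denoising by gain control in the frequency domain, the sketch below transforms a trajectory with a real FFT, scales each frequency bin by a gain, and transforms back. The hard cutoff, the function name `denoise_by_gain`, and the parameters `cutoff_hz` and `high_gain` are assumptions for this example; the paper's actual gain schedule may differ.

```python
import numpy as np

def denoise_by_gain(traj, dt, cutoff_hz, high_gain=0.0):
    """Attenuate high-frequency jitter in a uniformly sampled trajectory.

    traj: array of shape (N,) or (N, D), sampled every dt seconds.
    Bins at or below cutoff_hz keep gain 1; bins above are scaled by high_gain.
    """
    traj = np.asarray(traj, dtype=float)
    n = traj.shape[0]
    spectrum = np.fft.rfft(traj, axis=0)
    freqs = np.fft.rfftfreq(n, d=dt)
    gain = np.where(freqs <= cutoff_hz, 1.0, high_gain)
    # Reshape the gain so it broadcasts over any trailing coordinate axes.
    gain = gain.reshape((-1,) + (1,) * (traj.ndim - 1))
    return np.fft.irfft(spectrum * gain, n=n, axis=0)

# Usage: a 1 Hz motion with 20 Hz jitter, sampled at 100 Hz for 2 seconds.
t = np.arange(200) * 0.01
clean = np.sin(2 * np.pi * 1.0 * t)
noisy = clean + 0.2 * np.sin(2 * np.pi * 20.0 * t)
smoothed = denoise_by_gain(noisy, dt=0.01, cutoff_hz=5.0)
```

In this toy case both frequencies fall on exact FFT bins, so the jitter is removed almost perfectly; real demonstrations would leak across bins, and a smoother gain profile would likely behave better than a hard cutoff.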

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses DTW for trajectory shape similarity

Develops MSTOMP for robust optimization

Proposes MSES for frequency domain analysis
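One plausible reading of a spectrum-based mean square error is sketched below: take the discrete Fourier transform of both trajectories and average the squared magnitude of the difference. This is an assumption made for illustration, not the paper's definition; the function name `mses`, the orthonormal FFT scaling, and the equal-length requirement are choices of this sketch.

```python
import numpy as np

def mses(a, b):
    """Mean squared error between the spectra of two equal-length trajectories.

    a, b: arrays of shape (N,) or (N, D). The FFT runs along the time axis
    with orthonormal scaling; cost is O(N log N) versus O(N*M) for DTW.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("this sketch assumes trajectories of equal shape")
    spec_a = np.fft.fft(a, axis=0, norm="ortho")
    spec_b = np.fft.fft(b, axis=0, norm="ortho")
    return float(np.mean(np.abs(spec_a - spec_b) ** 2))

# Usage: compare two random 3-D trajectories of 64 samples each.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3))
y = rng.normal(size=(64, 3))
spectral_error = mses(x, y)
time_error = float(np.mean((x - y) ** 2))
```

With orthonormal scaling and the full spectrum, Parseval's theorem makes `spectral_error` equal to `time_error`, which is one simple example of a time-domain and frequency-domain connection. A practical metric would probably weight or truncate frequency bins, at which point the two quantities diverge.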

🔎 Similar Papers

Towards Effective Utilization of Mixed-Quality Demonstrations in Robotic Manipulation via Segment-Level Selection and Optimization