Robust Policy Search for Robot Navigation

📅 2020-03-02
🏛️ IEEE Robotics and Automation Letters
📈 Citations: 3
Influential: 0
🤖 AI Summary
To address low data efficiency and poor robustness in robot navigation policy learning under uncertainty, this paper proposes a Bayesian reinforcement learning framework integrating robust optimization with statistical robustness. Methodologically, it introduces (1) an unscented Bayesian optimization algorithm to ensure policy safety and reproducibility, and (2) a Boltzmann-based stochastic acquisition function coupled with an adaptive Gaussian process surrogate model to jointly enhance convergence and robustness against modeling errors. Evaluated on multiple benchmark functions and real-world legged locomotion tasks, the method achieves a 32% improvement in policy success rate, demonstrates significantly enhanced robustness to input perturbations and model mismatch, and substantially increases sample efficiency compared to existing approaches.
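The Boltzmann-based stochastic acquisition mentioned above amounts to sampling the next query with probability proportional to the exponentiated acquisition value, rather than always taking the argmax. A minimal sketch, assuming a finite candidate set and an illustrative temperature (the paper's actual acquisition function and schedule are not reproduced here):

```python
import numpy as np

def boltzmann_select(candidates, acq_values, temperature=1.0, rng=None):
    """Sample a query point with probability proportional to
    exp(acquisition / temperature): a stochastic alternative to
    deterministically picking the acquisition maximizer."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(acq_values, dtype=float) / temperature
    scaled -= scaled.max()          # shift for numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    idx = rng.choice(len(candidates), p=probs)
    return candidates[idx]

# Illustrative use: five candidate policy parameters with acquisition scores.
candidates = np.linspace(0.0, 1.0, 5)
acq = np.array([0.1, 0.5, 2.0, 0.4, 0.2])
x_next = boltzmann_select(candidates, acq, temperature=0.5)
```

As the temperature decreases, selection concentrates on the acquisition maximizer; higher temperatures inject exploration, which is what gives the method its robustness to surrogate modeling errors.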
📝 Abstract
Complex robot navigation and control problems can be framed as policy search problems. However, interactive learning in uncertain environments can be expensive, requiring the use of data-efficient methods. Bayesian optimization is an efficient nonlinear optimization method where queries are carefully selected to gather information about the optimum location. This is achieved by a surrogate model, which encodes past information, and the acquisition function for query selection. Bayesian optimization can be very sensitive to uncertainty in the input data or prior assumptions. In this letter, we incorporate both robust optimization and statistical robustness, showing that both types of robustness are synergistic. For robust optimization we use an improved version of unscented Bayesian optimization which provides safe and repeatable policies in the presence of policy uncertainty. We also provide new theoretical insights. For statistical robustness, we use an adaptive surrogate model and we introduce the Boltzmann selection as a stochastic acquisition method to have convergence guarantees and improved performance even with surrogate modeling errors. We present results in several optimization benchmarks and robot tasks.
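Unscented Bayesian optimization handles input (policy) uncertainty by propagating a small set of sigma points through a function instead of Monte Carlo sampling. The sketch below shows the underlying unscented transform for approximating an expectation under Gaussian input noise; the function `f`, the noise covariance, and the scaling parameter `kappa` are illustrative stand-ins, not the paper's surrogate or settings:

```python
import numpy as np

def unscented_expectation(f, x, cov, kappa=1.0):
    """Approximate E[f(x + eps)], eps ~ N(0, cov), with the unscented
    transform: evaluate f at 2n+1 deterministic sigma points and
    return their weighted mean."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = x.size
    # Columns of the scaled Cholesky factor give the sigma-point offsets.
    sqrt_cov = np.linalg.cholesky((n + kappa) * np.atleast_2d(cov))
    sigma_pts = [x] + [x + sqrt_cov[:, i] for i in range(n)] \
                    + [x - sqrt_cov[:, i] for i in range(n)]
    weights = np.array([kappa / (n + kappa)] + [0.5 / (n + kappa)] * (2 * n))
    return float(sum(w * f(p) for w, p in zip(weights, sigma_pts)))

# Sanity check: for a linear f the transform is exact.
f = lambda p: 3.0 * p[0] + 1.0
val = unscented_expectation(f, np.array([2.0]), np.array([[0.25]]))
```

Scoring each candidate by this perturbation-averaged value, rather than its nominal value alone, is what steers the search toward broad, safe optima that remain good under policy uncertainty.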
Problem

Research questions and friction points this paper is trying to address.

Robot Navigation
Data Efficiency
Stability Control
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning
Adaptive Modeling
Bayesian Optimization
Javier Garcia-Barcos
Instituto de Investigación en Ingeniería de Aragón (I3A), University of Zaragoza, Spain
Ruben Martinez-Cantin
Associate Professor, University of Zaragoza, Spain
Bayesian Optimization
Machine Learning
Robotics
Computer Vision