Autonomous Wheel Loader Navigation Using Goal-Conditioned Actor-Critic MPC

📅 2024-09-24

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the challenge of efficient autonomous navigation for wheel loaders toward arbitrary target poses, this paper proposes a framework that deeply integrates RL and MPC. Instead of a conventional hierarchical architecture, it directly embeds a differentiable critic network, trained via Actor-Critic reinforcement learning, into model predictive control (MPC) as analytically differentiable stage and terminal cost functions. This design tightly couples learning-based planning with model-driven control, achieving millisecond-level online replanning while improving time efficiency and robustness for highly nonlinear systems. The method combines deep reinforcement learning, neural function approximation, and hardware-in-the-loop deployment. Comprehensive simulations show trajectories that compare favorably against classical trajectory optimization, and real-vehicle experiments across diverse operational scenarios achieve stable and precise navigation.

📝 Abstract

This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior works which combine high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by incorporating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we first train an RL agent to solve the pose reaching task in simulation, then transfer the learned planning knowledge to an MPC by incorporating the trained neural network critic as both the stage and terminal cost. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real-world wheel loader, where we demonstrate successful navigation in various scenarios.

Problem

Research questions and friction points this paper is trying to address.

Enabling autonomous wheel loader navigation to arbitrary goal poses

Enhancing MPC planning with Actor-Critic RL cost function

Transferring RL-learned planning knowledge to real-world wheel loader

Innovation

Methods, ideas, or system contributions that make the work stand out.

Goal-Conditioned Actor-Critic MPC for navigation

RL-trained critic as MPC cost function

Deployed on real-world autonomous wheel loader
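
The idea of using a critic as the MPC cost can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the unicycle model in `step` stands in for the wheel loader's dynamics, the hand-written `critic` stands in for a trained neural network critic, the stage and terminal costs are taken as the negated critic value, and random shooting replaces whatever solver the authors use.

```python
import math
import random


def step(state, action, dt=0.1):
    """Toy unicycle kinematics (assumption; not the loader's real model)."""
    x, y, th = state
    v, w = action
    return (x + v * math.cos(th) * dt, y + v * math.sin(th) * dt, th + w * dt)


def critic(state, action, goal):
    """Hypothetical stand-in for a trained goal-conditioned critic Q(s, a, g).

    A hand-written score keeps the example runnable without a network:
    higher is better, so the MPC cost below is its negation.
    """
    x, y, th = state
    gx, gy, gth = goal
    dist = math.hypot(gx - x, gy - y)
    # Wrap the heading error into [-pi, pi] before taking its magnitude.
    heading_err = abs(math.atan2(math.sin(gth - th), math.cos(gth - th)))
    effort = 0.01 * (action[0] ** 2 + action[1] ** 2)
    return -(dist + 0.1 * heading_err + effort)


def rollout_cost(state, actions, goal):
    """Stage cost -Q at every step, plus terminal cost -Q evaluated at zero action."""
    cost = 0.0
    for a in actions:
        cost += -critic(state, a, goal)
        state = step(state, a)
    cost += -critic(state, (0.0, 0.0), goal)
    return cost


def mpc_action(state, goal, horizon=10, samples=200, rng=None):
    """Random-shooting MPC: sample action sequences, keep the cheapest one."""
    rng = rng or random.Random(0)
    best_cost, best_seq = math.inf, None
    for _ in range(samples):
        seq = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
               for _ in range(horizon)]
        c = rollout_cost(state, seq, goal)
        if c < best_cost:
            best_cost, best_seq = c, seq
    # Receding horizon: only the first action of the best sequence is applied.
    return best_seq[0], best_cost


def run_closed_loop(start, goal, steps=20, seed=0):
    """Replan at every step and apply the first action of each plan."""
    rng = random.Random(seed)
    state = start
    for _ in range(steps):
        action, _ = mpc_action(state, goal, rng=rng)
        state = step(state, action)
    return state


if __name__ == "__main__":
    print(run_closed_loop((0.0, 0.0, 0.0), (2.0, 1.0, 0.0)))
```

In this sketch the MPC has no hand-tuned tracking objective; all of its preference for good trajectories comes from the critic. With a neural critic, a gradient-based solver would be the more natural choice, since the network is differentiable with respect to state and action.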
