🤖 AI Summary
To address efficient autonomous navigation of wheel loaders toward arbitrary target poses, this paper proposes a framework that deeply integrates RL and MPC. Instead of a conventional hierarchical architecture, it embeds a differentiable critic network, trained via Actor-Critic reinforcement learning, directly into model predictive control (MPC) as the stage and terminal cost functions. This design tightly couples learning-based planning with model-driven control, achieving millisecond-level online replanning while improving navigation timeliness and robustness for a highly nonlinear system. The method combines deep reinforcement learning, neural function approximation, and hardware-in-the-loop deployment. Comprehensive simulations show that the resulting trajectories compare favorably against those from classical trajectory optimization, and real-vehicle experiments across a range of operating scenarios demonstrate stable and precise navigation.
📝 Abstract
This paper proposes a novel control method for an autonomous wheel loader, enabling time-efficient navigation to an arbitrary goal pose. Unlike prior work that combines high-level trajectory planners with Model Predictive Control (MPC), we directly enhance the planning capabilities of MPC by incorporating a cost function derived from Actor-Critic Reinforcement Learning (RL). Specifically, we first train an RL agent to solve the pose-reaching task in simulation, then transfer the learned planning knowledge to an MPC by incorporating the trained neural network critic as both the stage and terminal cost. We show through comprehensive simulations that the resulting MPC inherits the time-efficient behavior of the RL agent, generating trajectories that compare favorably against those found using trajectory optimization. We also deploy our method on a real-world wheel loader, where we demonstrate successful navigation in various scenarios.
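The idea of using a critic as both the stage and terminal cost of an MPC can be illustrated with a small sketch. This is not the authors' implementation, and every name in it is hypothetical. It assumes a hand-written surrogate in place of the trained neural network critic, a simple unicycle model in place of the wheel loader dynamics, a cost defined as the negated critic value, and a random-shooting search in place of whatever solver the paper uses (which the abstract does not specify).

```python
import numpy as np


def critic_value(pose, goal):
    """Placeholder for a trained critic V(x); higher means closer to success.

    ASSUMPTION: a real critic would be a neural network trained with
    Actor-Critic RL. This hand-written surrogate only mimics its role.
    """
    dx = goal[0] - pose[..., 0]
    dy = goal[1] - pose[..., 1]
    dth = goal[2] - pose[..., 2]
    dth = np.arctan2(np.sin(dth), np.cos(dth))  # wrap heading error to [-pi, pi]
    return -(np.hypot(dx, dy) + 0.5 * np.abs(dth))


def step(pose, u, dt=0.1):
    """One Euler step of unicycle kinematics.

    ASSUMPTION: a wheel loader is articulated; the unicycle is a stand-in.
    """
    v, w = u[..., 0], u[..., 1]
    x = pose[..., 0] + dt * v * np.cos(pose[..., 2])
    y = pose[..., 1] + dt * v * np.sin(pose[..., 2])
    th = pose[..., 2] + dt * w
    return np.stack([x, y, th], axis=-1)


def rollout_cost(pose0, controls, goal):
    """Cost of each candidate sequence; controls has shape (K, N, 2).

    ASSUMPTION: stage and terminal cost are both the negated critic value.
    """
    num_candidates, horizon, _ = controls.shape
    pose = np.broadcast_to(pose0, (num_candidates, 3)).copy()
    cost = np.zeros(num_candidates)
    for k in range(horizon):
        cost += -critic_value(pose, goal)  # critic as stage cost
        pose = step(pose, controls[:, k])
    cost += -critic_value(pose, goal)      # critic as terminal cost
    return cost


def mpc_action(pose0, goal, horizon=10, samples=256, seed=0):
    """Random-shooting MPC: sample sequences, return the first control of the best."""
    rng = np.random.default_rng(seed)
    controls = rng.uniform([-1.0, -1.0], [1.0, 1.0], size=(samples, horizon, 2))
    controls[0] = 0.0  # always include the "stand still" candidate as a baseline
    costs = rollout_cost(pose0, controls, goal)
    best = int(np.argmin(costs))
    return controls[best, 0], costs[best], costs[0]


if __name__ == "__main__":
    start = np.array([0.0, 0.0, 0.0])
    target = np.array([2.0, 1.0, 0.5])
    u0, best_cost, idle_cost = mpc_action(start, target)
    print("first control (v, w):", u0)
    print("best cost:", best_cost, "idle cost:", idle_cost)
```

In a receding-horizon loop, only the first control would be applied before replanning from the new pose. Because the critic in the paper is differentiable, a gradient-based solver could replace the sampling step used here.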