ManeuverNet: A Soft Actor-Critic Framework for Precise Maneuvering of Double-Ackermann-Steering Robots with Optimized Reward Functions

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges of motion planning for double-Ackermann-steering robots in confined spaces, where conventional planners are highly sensitive to parameter tuning and existing end-to-end deep reinforcement learning (DRL) approaches often fail to converge because of poorly designed reward functions. To overcome these limitations, the authors propose ManeuverNet, the first DRL framework specifically tailored for double-Ackermann systems; it integrates Soft Actor-Critic with CrossQ and introduces four novel reward functions explicitly designed for non-holonomic constraints. The approach enables autonomous maneuver learning without expert demonstrations or manual intervention. Experimental results show that ManeuverNet improves success rates by over 40% compared to DRL baselines and achieves up to 90% higher trajectory efficiency in real-world scenarios, significantly enhancing policy generalization and robustness while mitigating the parameter sensitivity of planners such as the Timed Elastic Band (TEB) planner.
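For context on the non-holonomic constraint the summary refers to: a double-Ackermann (four-wheel-steering) vehicle steers its rear axle opposite to the front, roughly halving the turning radius, but the body can still only move along its heading. A minimal kinematic sketch of this, using a symmetric four-wheel-steer bicycle model (an illustration only, not the paper's dynamics model; all names and parameters here are assumptions):

```python
import math

def double_ackermann_step(x, y, theta, v, delta_f, wheelbase, dt):
    """One Euler step of a symmetric double-Ackermann bicycle model.

    The rear steering angle mirrors the front (delta_r = -delta_f),
    which roughly halves the turning radius versus single Ackermann.
    Illustrative sketch only -- not the paper's model.
    """
    delta_r = -delta_f
    # Yaw rate of the four-wheel-steer bicycle model.
    theta_dot = v * (math.tan(delta_f) - math.tan(delta_r)) / wheelbase
    # Non-holonomic constraint: translation only along the current heading.
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += theta_dot * dt
    return x, y, theta
```

With `delta_r = -delta_f` and equal front/rear lever arms, the yaw rate is twice that of a front-steer-only vehicle at the same steering angle, which is why such platforms can maneuver in tighter spaces.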

📝 Abstract
Autonomous control of double-Ackermann-steering robots is essential in agricultural applications, where robots must execute precise and complex maneuvers within a limited space. Classical methods, such as the Timed Elastic Band (TEB) planner, can address this problem, but they rely on parameter tuning, making them highly sensitive to changes in robot configuration or environment and impractical to deploy without constant recalibration. At the same time, end-to-end deep reinforcement learning (DRL) methods often fail due to unsuitable reward functions for non-holonomic constraints, resulting in sub-optimal policies and poor generalization. To address these challenges, this paper presents ManeuverNet, a DRL framework tailored for double-Ackermann systems, combining Soft Actor-Critic with CrossQ. Furthermore, ManeuverNet introduces four specifically designed reward functions to support maneuver learning. Unlike prior work, ManeuverNet does not depend on expert data or handcrafted guidance. We extensively evaluate ManeuverNet against both state-of-the-art DRL baselines and the TEB planner. Experimental results demonstrate that our framework substantially improves maneuverability and success rates, achieving more than a 40% gain over DRL baselines. Moreover, ManeuverNet effectively mitigates the strong parameter sensitivity observed in the TEB planner. In real-world trials, ManeuverNet achieved up to a 90% increase in maneuvering trajectory efficiency, highlighting its robustness and practical applicability.
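The abstract emphasizes that reward design is the make-or-break factor for DRL under non-holonomic constraints. The paper's four reward functions are not reproduced here; as a generic illustration of the kind of dense shaping such frameworks rely on, the following sketch combines goal progress, heading alignment, and a sparse success bonus (all names, weights, and thresholds are hypothetical):

```python
import math

def shaped_reward(pos, heading, goal, prev_dist, reached_tol=0.1):
    """Hypothetical dense reward for goal-directed maneuvering.

    Combines progress toward the goal, a heading-alignment bonus,
    and a sparse success bonus. Illustrative of reward shaping in
    general -- NOT the paper's four reward functions.
    """
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    progress = prev_dist - dist                      # > 0 when moving closer
    align = math.cos(math.atan2(dy, dx) - heading)   # 1 when facing the goal
    reward = progress + 0.1 * align
    if dist < reached_tol:
        reward += 10.0                               # sparse success bonus
    return reward, dist
```

A reward of this shape gives the agent a gradient at every step, which is the usual remedy when sparse goal-only rewards cause the convergence failures the abstract describes.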
Problem

Research questions and friction points this paper is trying to address.

double-Ackermann-steering
precise maneuvering
reward function design
parameter sensitivity
non-holonomic constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Double-Ackermann steering
Soft Actor-Critic
Reward function design
Deep reinforcement learning
Autonomous maneuvering
Kohio Deflesselle
Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, F-33400 Talence, France
Mélodie Daniel
Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, F-33400 Talence, France
Aly Magassouba
School of Computer Science, University of Nottingham, UK
Miguel Aranda
Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, 50018 Zaragoza, Spain
Olivier Ly
Associate Professor - LaBRI - Université de Bordeaux