🤖 AI Summary
In dynamic, partially observable environments, autonomous navigation suffers from poor coordination between path planning and motion control, weak safety guarantees, and excessive control jitter. To address these issues, this paper proposes a hierarchical reinforcement learning framework that pairs a high-level discrete task planner based on Deep Q-Networks (DQN) with a low-level continuous-action controller implemented via Twin Delayed Deep Deterministic Policy Gradient (TD3). The authors further introduce a LiDAR-driven safety gate mechanism and a multi-component reward shaping strategy to improve obstacle avoidance robustness and action smoothness. Evaluated in ROS/Gazebo simulations using the PathBench benchmark, the method raises navigation success rate by 21.3%, cuts training steps by 37%, reduces collision rate by 42.6%, and lowers control discontinuity frequency by 58.1%. It also generalizes well to unseen obstacle configurations.
📝 Abstract
This paper presents a hierarchical path-planning and control framework that combines a high-level Deep Q-Network (DQN) for discrete sub-goal selection with a low-level Twin Delayed Deep Deterministic Policy Gradient (TD3) controller for continuous actuation. The high-level module selects behaviors and sub-goals; the low-level module executes smooth velocity commands. We design a practical reward shaping scheme (direction, distance, obstacle avoidance, action smoothness, collision penalty, time penalty, and progress), together with a LiDAR-based safety gate that prevents unsafe motions. The system is implemented in ROS + Gazebo (TurtleBot3) and evaluated with PathBench metrics, including success rate, collision rate, path efficiency, and re-planning efficiency, in dynamic and partially observable environments. Experiments show improved success rate and sample efficiency over single-algorithm baselines (DQN or TD3 alone) and rule-based planners, with better generalization to unseen obstacle configurations and reduced abrupt control changes. Code and evaluation scripts are available at the project repository.
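As a rough illustration of the two mechanisms named in the abstract, the Python sketch below shows one way a LiDAR-based safety gate and a seven-term shaped reward could be written. It is not the authors' implementation: the function names (`safety_gate`, `shaped_reward`), the distance thresholds, the individual term formulas, and all weights are assumptions chosen for readability. The gate also uses the nearest return over the whole scan, whereas a real system would likely restrict it to the sector in the direction of travel.

```python
import math
from typing import Dict, Sequence, Tuple

# All thresholds and weights are illustrative assumptions, not values from the paper.
STOP_DIST = 0.25  # metres: at or inside this range, forward motion is blocked
SLOW_DIST = 0.60  # metres: inside this range, forward speed is scaled down


def nearest_range(ranges: Sequence[float]) -> float:
    """Smallest valid LiDAR return; inf if the scan has no valid returns."""
    valid = [r for r in ranges if math.isfinite(r) and r > 0.0]
    return min(valid) if valid else math.inf


def safety_gate(v: float, w: float, ranges: Sequence[float]) -> Tuple[float, float]:
    """Filter a (linear, angular) velocity command using the nearest obstacle."""
    d = nearest_range(ranges)
    if d <= STOP_DIST:
        # Too close: forbid forward motion, still allow turning or reversing.
        return min(v, 0.0), w
    if d < SLOW_DIST and v > 0.0:
        # Linearly ramp forward speed from 0 at STOP_DIST to full at SLOW_DIST.
        return v * (d - STOP_DIST) / (SLOW_DIST - STOP_DIST), w
    return v, w


# Hypothetical weights for the seven reward components listed in the abstract.
WEIGHTS: Dict[str, float] = {
    "direction": 0.5, "distance": 0.1, "obstacle": 1.0, "smoothness": 0.2,
    "collision": 100.0, "time": 0.01, "progress": 5.0,
}


def shaped_reward(prev_dist: float, dist: float, heading_err: float,
                  min_range: float, action: Tuple[float, float],
                  prev_action: Tuple[float, float], collided: bool,
                  weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the reward components (each term's form is an assumption)."""
    terms = {
        "direction": math.cos(heading_err),            # facing the goal
        "distance": -dist,                             # remaining distance to goal
        "obstacle": -max(0.0, SLOW_DIST - min_range),  # proximity to obstacles
        "smoothness": -(abs(action[0] - prev_action[0])
                        + abs(action[1] - prev_action[1])),  # command change
        "collision": -1.0 if collided else 0.0,        # collision penalty
        "time": -1.0,                                  # per-step time penalty
        "progress": prev_dist - dist,                  # distance gained this step
    }
    return sum(weights[name] * value for name, value in terms.items())
```

In this arrangement the gate would sit between the TD3 policy output and the velocity command sent to the robot, so that it can override an unsafe action regardless of what the learned policy proposes, while the reward only influences training.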