Hybrid DQN-TD3 Reinforcement Learning for Autonomous Navigation in Dynamic Environments

📅 2025-10-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In dynamic, partially observable environments, autonomous navigation suffers from poor coordination between path planning and motion control, weak safety guarantees, and excessive control jitter. To address these issues, this paper proposes a hierarchical reinforcement learning framework: a high-level discrete task planner based on Deep Q-Networks (DQN) and a low-level continuous-action controller implemented with Twin Delayed Deep Deterministic Policy Gradient (TD3). It further introduces a LiDAR-driven safety gate and a multi-component reward shaping strategy to improve obstacle-avoidance robustness and action smoothness. Evaluated in ROS/Gazebo simulations with the PathBench benchmark, the method reports a 21.3% increase in navigation success rate, a 37% reduction in training steps, a 42.6% decrease in collision rate, and a 58.1% reduction in control discontinuity frequency. It also generalizes well to unseen obstacle configurations.

📝 Abstract

This paper presents a hierarchical path-planning and control framework that combines a high-level Deep Q-Network (DQN) for discrete sub-goal selection with a low-level Twin Delayed Deep Deterministic Policy Gradient (TD3) controller for continuous actuation. The high-level module selects behaviors and sub-goals; the low-level module executes smooth velocity commands. We design a practical reward shaping scheme (direction, distance, obstacle avoidance, action smoothness, collision penalty, time penalty, and progress), together with a LiDAR-based safety gate that prevents unsafe motions. The system is implemented in ROS + Gazebo (TurtleBot3) and evaluated with PathBench metrics, including success rate, collision rate, path efficiency, and re-planning efficiency, in dynamic and partially observable environments. Experiments show improved success rate and sample efficiency over single-algorithm baselines (DQN or TD3 alone) and rule-based planners, with better generalization to unseen obstacle configurations and reduced abrupt control changes. Code and evaluation scripts are available at the project repository.
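The seven reward terms listed in the abstract can be illustrated with a minimal sketch. The abstract names the components but not their formulas or weights, so every functional form, weight, and parameter name below (`RewardWeights`, `shaped_reward`, `safe_dist`) is an assumption chosen for illustration, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the paper's actual values are not given here.
    direction: float = 0.5
    distance: float = 0.1
    obstacle: float = 1.0
    smoothness: float = 0.2
    collision: float = 10.0
    time: float = 0.01
    progress: float = 2.0


def shaped_reward(dist, prev_dist, heading_error, min_range,
                  action, prev_action, collided,
                  weights=None, safe_dist=0.5):
    """Return (total, per-term dict) for one step of a shaped reward.

    dist / prev_dist: current and previous distance to the goal (m)
    heading_error:    angle between robot heading and goal bearing (rad)
    min_range:        smallest valid LiDAR range this step (m)
    action / prev_action: (linear, angular) velocity commands
    """
    w = weights or RewardWeights()
    terms = {
        # Reward facing the goal: +1 when aligned, -1 when facing away.
        "direction": w.direction * math.cos(heading_error),
        # Penalize remaining distance to the goal.
        "distance": -w.distance * dist,
        # Penalize intrusion into the assumed safety margin.
        "obstacle": -w.obstacle * max(0.0, safe_dist - min_range),
        # Penalize abrupt changes between consecutive commands.
        "smoothness": -w.smoothness * sum(
            abs(a - b) for a, b in zip(action, prev_action)),
        # Large one-off penalty on collision.
        "collision": -w.collision if collided else 0.0,
        # Constant per-step cost to discourage dawdling.
        "time": -w.time,
        # Reward the reduction in goal distance since the last step.
        "progress": w.progress * (prev_dist - dist),
    }
    return sum(terms.values()), terms
```

Returning the per-term dictionary alongside the total makes it easier to log each component during training and check that no single term dominates.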

Problem

Research questions and friction points this paper is trying to address.

Develops hybrid reinforcement learning for autonomous robot navigation in dynamic environments

Combines high-level DQN planning with low-level TD3 control for smooth motion execution

Addresses safety and efficiency challenges in partially observable environments with obstacles
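The split between discrete planning and continuous control can be sketched as a two-rate loop on a toy 2D point robot. Neither policy below is a trained network: `greedy_q` is a placeholder scoring function standing in for DQN Q-values, `low_level_act` is a bounded proportional rule standing in for a TD3 actor, and the sub-goal set, re-planning interval, and all names are hypothetical.

```python
import math

# Hypothetical discrete action set for the high level: unit offsets (m).
SUBGOAL_OFFSETS = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]


def greedy_q(pos, goal, offset):
    """Placeholder for a learned Q-value: negative goal distance
    after moving by `offset`."""
    return -math.hypot(goal[0] - (pos[0] + offset[0]),
                       goal[1] - (pos[1] + offset[1]))


def high_level_select(pos, goal, q_fn):
    """DQN-style greedy selection: argmax of scores over discrete actions."""
    scores = [q_fn(pos, goal, off) for off in SUBGOAL_OFFSETS]
    best = max(range(len(scores)), key=scores.__getitem__)
    off = SUBGOAL_OFFSETS[best]
    return (pos[0] + off[0], pos[1] + off[1])


def low_level_act(pos, subgoal, max_speed=0.2):
    """Stand-in for a continuous actor: bounded velocity toward the sub-goal."""
    dx, dy = subgoal[0] - pos[0], subgoal[1] - pos[1]
    d = math.hypot(dx, dy)
    if d < 1e-9:
        return (0.0, 0.0)
    speed = min(max_speed, d)
    return (speed * dx / d, speed * dy / d)


def run_episode(start, goal, replan_every=5, max_steps=200, tol=0.1):
    """High level re-plans every `replan_every` steps; low level acts every step."""
    pos, subgoal = start, start
    for t in range(max_steps):
        if math.hypot(goal[0] - pos[0], goal[1] - pos[1]) < tol:
            return t, pos
        if t % replan_every == 0:
            subgoal = high_level_select(pos, goal, greedy_q)
        vx, vy = low_level_act(pos, subgoal)
        pos = (pos[0] + vx, pos[1] + vy)
    return max_steps, pos
```

The point of the sketch is the timing structure: the discrete choice changes only at re-planning steps, while the velocity command is recomputed every step.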

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid DQN-TD3 hierarchical framework for navigation

LiDAR safety gate prevents unsafe robot motions

Reward shaping combines multiple navigation objectives
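One plausible form of a LiDAR safety gate is a filter between the controller and the motor command. The summary does not specify the gating rule, so the forward cone, the stop and slow-down thresholds, and the function name `safety_gate` are all assumptions. The `ranges`, `angle_min`, and `angle_increment` arguments mirror the fields of a ROS `sensor_msgs/LaserScan` message but are passed as plain Python values so the sketch runs without ROS.

```python
import math


def safety_gate(v, w, ranges, angle_min, angle_increment,
                half_cone=math.radians(30), stop_dist=0.25, slow_dist=0.6):
    """Filter a (linear, angular) velocity command using a laser scan.

    Forward motion is blocked when the nearest valid return inside the
    forward cone is within `stop_dist`, and scaled linearly between
    `stop_dist` and `slow_dist`. Rotation is always passed through so the
    robot can turn away. All thresholds are illustrative assumptions.
    """
    front = []
    for i, r in enumerate(ranges):
        # Skip invalid returns (inf, NaN, or non-positive readings).
        if not math.isfinite(r) or r <= 0.0:
            continue
        angle = angle_min + i * angle_increment
        # Wrap to [-pi, pi] so 0 rad is straight ahead.
        angle = math.atan2(math.sin(angle), math.cos(angle))
        if abs(angle) <= half_cone:
            front.append(r)

    # Nothing detected ahead, or the command is not forward: pass through.
    if not front or v <= 0.0:
        return v, w

    nearest = min(front)
    if nearest <= stop_dist:
        return 0.0, w
    if nearest < slow_dist:
        scale = (nearest - stop_dist) / (slow_dist - stop_dist)
        return v * scale, w
    return v, w
```

For example, with a 360-beam scan starting at 0 rad, an obstacle 0.2 m straight ahead zeroes the linear velocity while leaving the angular velocity untouched, and the same obstacle directly behind the robot leaves the command unchanged.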
