🤖 AI Summary
This work addresses the conflict between safety and exploration when deep reinforcement learning is used to navigate unmanned aerial vehicles through dynamic, high-risk environments, where sensor noise and uncertainty about obstacle intentions degrade constraint-based methods. The authors propose Dynamic-TD3, a safe navigation framework that formulates navigation as a constrained Markov decision process (CMDP). It uses an Adaptive Trajectory Relational Evolution Mechanism (ATREM) to model long-range obstacle intentions and a Physically Aware Gated Kalman Filter (PAG-KF) to suppress non-stationary observation noise. Lagrangian relaxation then balances mission efficiency against hard safety constraints within a single policy. In experiments with aggressive dynamic threats, the method improves obstacle avoidance success rates, reduces energy consumption, and produces smoother flight trajectories.
📝 Abstract
Deep reinforcement learning (DRL) is widely used for autonomous drone navigation in complex, high-risk environments. However, its practical deployment faces a safety-exploration dilemma: soft penalty mechanisms encourage risky trial-and-error, while most constraint-based methods suffer degraded performance under sensor noise and intent uncertainty. We propose Dynamic-TD3, a physically enhanced framework that models navigation as a Constrained Markov Decision Process (CMDP) in order to enforce strict safety constraints while maintaining maneuverability. The framework integrates an Adaptive Trajectory Relational Evolution Mechanism (ATREM) to capture long-range intentions and employs a Physically Aware Gated Kalman Filter (PAG-KF) to mitigate non-stationary observation noise. The resulting state representation drives a dual-criterion policy that balances mission efficiency against hard safety constraints via Lagrangian relaxation. In experiments with aggressive dynamic threats, this approach demonstrates superior collision avoidance performance, reduced energy consumption, and smoother flight trajectories.
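To make the Lagrangian relaxation step concrete, the following is a minimal sketch of the generic CMDP dual update, not the Dynamic-TD3 implementation. The abstract does not give the update rule, so everything here is an assumption for illustration: the function names, the step size `lr`, the `cost_limit` budget, and the sample cost values are all hypothetical.

```python
def lagrangian_objective(reward_value: float, cost_value: float, lam: float) -> float:
    """Scalarized objective the policy would maximize: reward minus weighted safety cost."""
    return reward_value - lam * cost_value


def update_multiplier(lam: float, avg_cost: float, cost_limit: float, lr: float = 0.05) -> float:
    """Projected dual ascent step on the multiplier.

    The multiplier grows when the measured safety cost exceeds the budget
    and shrinks otherwise; it is clipped at zero so it stays non-negative.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))


if __name__ == "__main__":
    lam = 0.0
    cost_limit = 0.1  # hypothetical safety budget per episode
    # Hypothetical per-episode average safety costs, for illustration only.
    for avg_cost in [0.5, 0.4, 0.2, 0.05, 0.05]:
        lam = update_multiplier(lam, avg_cost, cost_limit)
        print(f"avg_cost={avg_cost:.2f} -> lambda={lam:.4f}")
```

In this sketch the multiplier acts as an adaptive penalty weight: while the policy violates the budget, the cost term weighs more heavily in the objective, and once the constraint is satisfied the weight decays back toward zero. A fixed soft penalty, by contrast, keeps the same weight regardless of how often the constraint is violated.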