Trust, Geometry, and Rules: A Credibility-Aware Reinforcement Learning Framework for Safe USV Navigation under Uncertainty

📅 2026-05-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses safe, COLREGs-compliant navigation for unmanned surface vehicles (USVs) when perception is uncertain. The authors propose a reinforcement learning framework that combines credibility-aware learning, geometric safety shielding, and continuous rule embedding. Credibility-weighted value learning scales the critic's loss by a dynamic trust factor so that the policy does not overfit to noisy samples. A covariance-inflated velocity obstacle (CI-VO) maps localization uncertainty into conservative collision-avoidance boundaries. Discrete navigation rules are relaxed into continuous duty signals to stabilize the policy. In simulated encounter studies, the method trains more robustly under inconsistent perception and outperforms the baselines in collision avoidance and COLREGs compliance.
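To make the covariance-inflated velocity obstacle idea concrete, the following is a minimal sketch, not the authors' formulation. It assumes a planar encounter, a circular combined safety radius, and an inflation rule that adds `k_sigma` standard deviations along the worst-case axis of the 2x2 position covariance. The function names, the `k_sigma` parameter, and the inflation rule itself are hypothetical choices made for illustration.

```python
import math


def max_eigenvalue_2x2(cov):
    """Largest eigenvalue of a symmetric 2x2 covariance [[a, b], [b, c]]."""
    a, b, c = cov[0][0], cov[0][1], cov[1][1]
    return 0.5 * (a + c) + math.hypot(0.5 * (a - c), b)


def inflated_half_angle(rel_pos, radius, cov, k_sigma=2.0):
    """Half-angle (rad) of the collision cone after inflating the radius.

    rel_pos is the obstacle position minus the own-ship position.
    Assumption: the radius grows by k_sigma standard deviations taken
    along the most uncertain direction of the position estimate.
    """
    dist = math.hypot(rel_pos[0], rel_pos[1])
    sigma_max = math.sqrt(max(max_eigenvalue_2x2(cov), 0.0))
    inflated = radius + k_sigma * sigma_max
    if dist <= inflated:
        # Already inside the inflated disc: every heading counts as unsafe.
        return math.pi
    return math.asin(inflated / dist)


def is_velocity_unsafe(own_vel, obs_vel, rel_pos, radius, cov, k_sigma=2.0):
    """True if the relative velocity points into the inflated collision cone."""
    half = inflated_half_angle(rel_pos, radius, cov, k_sigma)
    if half >= math.pi:
        return True
    rvx = own_vel[0] - obs_vel[0]
    rvy = own_vel[1] - obs_vel[1]
    if math.hypot(rvx, rvy) == 0.0:
        # No relative motion, so the separation does not change.
        return False
    diff = math.atan2(rvy, rvx) - math.atan2(rel_pos[1], rel_pos[0])
    diff = (diff + math.pi) % (2.0 * math.pi) - math.pi  # wrap to [-pi, pi)
    return abs(diff) < half
```

Under these assumptions, a candidate velocity that clears the nominal cone can still be rejected once the covariance grows, which is the conservative behavior the summary describes. A shield built on this check would replace a rejected action with a safe one; how that replacement is chosen is not specified here.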
📝 Abstract
Autonomous navigation of Unmanned Surface Vehicles (USVs) that is safe and compliant with the International Regulations for Preventing Collisions at Sea (COLREGs) remains a formidable challenge in dynamic maritime environments, particularly when perception systems exhibit miscalibrated uncertainty. Existing Reinforcement Learning (RL)-based methods often falter because state-estimation errors induce unreliable belief states that mislead the value function, while discrete traffic rules introduce discontinuity in the learning objective. To address these challenges, we propose a framework integrating credibility-aware learning, geometric safety shielding, and continuous rule-aware embedding. First, Credibility-Weighted Value Learning (CW-VL) introduces a dynamic trust factor derived from the discrepancy between filter-estimated covariance and empirical error statistics to modulate the critic's heteroscedastic loss, preventing policy overfitting to noisy samples. Second, the Covariance-Inflated Velocity Obstacle (CI-VO) maps position-estimation uncertainty into set-wise angular margins, forming a conservative geometric shield that overrides hazardous exploratory actions. Third, Risk-Aware COLREGs Duty Embedding relaxes binary encounter duties into continuous rule-aware signals, providing smooth sector-transition information and suppressing oscillation from sparse rule rewards. Simulated encounter studies demonstrate improved training robustness against perceptual inconsistency and superior collision avoidance and COLREGs compliance over baselines.
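The credibility-weighting idea in the abstract can be illustrated with a small sketch. This is an assumption-laden toy, not the paper's CW-VL definition: it takes the trust factor to be `exp(-sharpness * |log(ratio)|)`, where `ratio` compares mean empirical squared error with mean filter-reported variance, and it uses a Gaussian negative log-likelihood (constant term dropped) as the heteroscedastic critic loss. The names `trust_factor`, `weighted_heteroscedastic_loss`, `sharpness`, and `eps` are hypothetical.

```python
import math


def trust_factor(sq_errors, predicted_vars, sharpness=1.0, eps=1e-12):
    """Map filter consistency to a trust weight in (0, 1].

    sq_errors: empirical squared estimation errors over a window.
    predicted_vars: variances the filter reported for the same samples.
    A ratio of 1 means the filter is calibrated and trust is 1. Both
    overconfidence (ratio > 1) and underconfidence (ratio < 1) lower it.
    """
    mean_err = sum(sq_errors) / len(sq_errors)
    mean_var = sum(predicted_vars) / len(predicted_vars)
    ratio = (mean_err + eps) / (mean_var + eps)
    return math.exp(-sharpness * abs(math.log(ratio)))


def weighted_heteroscedastic_loss(target, value_mean, value_var, trust):
    """Gaussian negative log-likelihood (constant dropped), scaled by trust.

    value_mean and value_var stand for the critic's predicted mean and
    variance of the return; value_var must be positive.
    """
    nll = 0.5 * math.log(value_var) + 0.5 * (target - value_mean) ** 2 / value_var
    return trust * nll


# Example: the filter reports variance 1.0 but the observed squared
# error averages 4.0, so samples from this window are down-weighted.
w = trust_factor([4.0, 4.0], [1.0, 1.0])
loss = weighted_heteroscedastic_loss(target=1.0, value_mean=0.0, value_var=1.0, trust=w)
```

With this choice, a sample drawn while the filter is badly miscalibrated contributes less to the critic update. Other monotone mappings from the discrepancy to a weight would serve the same illustrative purpose.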
Problem

Research questions and friction points this paper is trying to address.

Unmanned Surface Vehicles
Reinforcement Learning
COLREGs compliance
Perception uncertainty
Safe navigation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Credibility-Aware Reinforcement Learning
Covariance-Inflated Velocity Obstacle
Risk-Aware COLREGs Embedding
Heteroscedastic Value Learning
Geometric Safety Shielding
Yuhang Zhang
School of Information Engineering, Henan University of Science and Technology, Luoyang 471023, China
Shuqi Chai
Shenzhen Research Institute of Big Data, Shenzhen 518172, China
Yukang Zhang
Xiamen University
Liusha Yang
Shenzhen Technology University, Shenzhen 518118, China
Mingchuan Zhang
School of Information Engineering, Henan University of Science and Technology, Luoyang 471023, China
Wei Wang
School of Computer Science, Wuhan University, Wuhan 430072, China
Qingjiang Shi
School of Software Engineering, Tongji University, Shanghai 201804, China
Quanbo Ge
School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China