Trust, Geometry, and Rules: A Credibility-Aware Reinforcement Learning Framework for Safe USV Navigation under Uncertainty

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of keeping unmanned surface vehicles (USVs) both safe and COLREGs-compliant when perception uncertainty is miscalibrated. The authors propose a reinforcement learning framework that combines credibility-aware learning, geometric safety shielding, and continuous rule embedding. Credibility-weighted value learning derives a trust factor from the gap between filter-estimated covariance and empirical error statistics and uses it to modulate the critic's loss. A covariance-inflated velocity obstacle (CI-VO) maps position-estimation uncertainty into conservative angular margins that override hazardous exploratory actions. Discrete encounter duties are relaxed into continuous duty signals, which smooths sector transitions and suppresses oscillation from sparse rule rewards. In simulated encounter studies, the method improves training robustness under perceptual inconsistency and outperforms baselines in collision avoidance and COLREGs compliance.
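The credibility-weighted value learning idea can be illustrated with a minimal sketch. The paper's exact formulas are not given here, so everything below is an assumption made for illustration: the trust factor is computed from the normalized estimation error squared (NEES), a standard filter-consistency statistic, and the critic loss is a Gaussian negative log-likelihood scaled by that trust. The names `trust_factor`, `credibility_weighted_nll`, and `sharpness` are hypothetical.

```python
import numpy as np


def trust_factor(errors, covs, sharpness=1.0):
    """Hypothetical trust score in (0, 1] from filter consistency.

    errors: (N, d) empirical estimation errors.
    covs:   (N, d, d) covariances reported by the filter.
    A consistent filter has mean NEES close to d, which gives trust 1.
    Over- or under-confident filters are penalized symmetrically.
    """
    d = errors.shape[1]
    # Solve P x = e per sample, then NEES = e^T P^{-1} e.
    solved = np.linalg.solve(covs, errors[..., None])[..., 0]
    nees = np.einsum("nd,nd->n", errors, solved)
    ratio = max(float(nees.mean()) / d, 1e-12)
    return float(np.exp(-sharpness * abs(np.log(ratio))))


def credibility_weighted_nll(td_target, q_mean, q_log_var, trust):
    """Gaussian (heteroscedastic) NLL for the critic, scaled by trust.

    This is an assumed form of the loss, not the authors' definition.
    """
    var = np.exp(q_log_var)
    nll = 0.5 * (q_log_var + (td_target - q_mean) ** 2 / var)
    return float(np.mean(trust * nll))


if __name__ == "__main__":
    covs = np.tile(np.eye(2), (4, 1, 1))
    consistent = trust_factor(np.ones((4, 2)), covs)
    overconfident = trust_factor(2.0 * np.ones((4, 2)), covs)
    print(consistent, overconfident)
```

In this sketch, samples collected while the filter's reported covariance disagrees with its observed errors contribute less to the critic update, which is one plausible reading of "preventing policy overfitting to noisy samples".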

📝 Abstract

Autonomous navigation of Unmanned Surface Vehicles (USVs) that is safe and compliant with the International Regulations for Preventing Collisions at Sea (COLREGs) remains a formidable challenge in dynamic maritime environments, particularly when perception systems exhibit miscalibrated uncertainty. Existing Reinforcement Learning (RL)-based methods often falter because state-estimation errors induce unreliable belief states that mislead the value function, while discrete traffic rules introduce discontinuity in the learning objective. To address these challenges, we propose a framework integrating credibility-aware learning, geometric safety shielding, and continuous rule-aware embedding. First, Credibility-Weighted Value Learning (CW-VL) introduces a dynamic trust factor derived from the discrepancy between filter-estimated covariance and empirical error statistics to modulate the critic's heteroscedastic loss, preventing policy overfitting to noisy samples. Second, the Covariance-Inflated Velocity Obstacle (CI-VO) maps position-estimation uncertainty into set-wise angular margins, forming a conservative geometric shield that overrides hazardous exploratory actions. Third, Risk-Aware COLREGs Duty Embedding relaxes binary encounter duties into continuous rule-aware signals, providing smooth sector-transition information and suppressing oscillation from sparse rule rewards. Simulated encounter studies demonstrate improved training robustness against perceptual inconsistency and superior collision avoidance and COLREGs compliance over baselines.
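As a rough illustration of the CI-VO shield, the sketch below widens a classical velocity-obstacle cone using the obstacle's position covariance. The abstract does not specify the inflation rule, so the one used here is an assumption: the safety radius grows by `k_sigma` times the largest standard deviation of the position covariance, and the cone half-angle follows from the inflated radius. The function names and the `k_sigma` parameter are hypothetical.

```python
import math

import numpy as np


def ci_vo_half_angle(dist, safe_radius, pos_cov, k_sigma=2.0):
    """Half-angle of the velocity-obstacle cone after covariance inflation."""
    # Largest eigenvalue of the symmetric covariance gives the worst-case axis.
    lam_max = max(float(np.linalg.eigvalsh(pos_cov)[-1]), 0.0)
    inflated = safe_radius + k_sigma * math.sqrt(lam_max)
    if dist <= inflated:
        # Already inside the inflated zone: block every closing direction.
        return math.pi / 2
    return math.asin(inflated / dist)


def is_unsafe(own_vel, obs_vel, rel_pos, safe_radius, pos_cov, k_sigma=2.0):
    """True if the relative velocity lies inside the inflated cone.

    rel_pos is the obstacle position minus the own-ship position.
    """
    rel_pos = np.asarray(rel_pos, dtype=float)
    rel_vel = np.asarray(own_vel, dtype=float) - np.asarray(obs_vel, dtype=float)
    dist = float(np.linalg.norm(rel_pos))
    speed = float(np.linalg.norm(rel_vel))
    if dist < 1e-9:
        return True   # coincident positions: treat as unsafe
    if speed < 1e-9:
        return False  # no relative motion, so no closing trajectory
    half = ci_vo_half_angle(dist, safe_radius, pos_cov, k_sigma)
    cos_angle = float(np.dot(rel_vel, rel_pos)) / (speed * dist)
    angle = math.acos(min(1.0, max(-1.0, cos_angle)))
    return angle < half


if __name__ == "__main__":
    args = ([5.0, 1.0], [0.0, 0.0], [100.0, 0.0], 10.0)
    print(is_unsafe(*args, np.zeros((2, 2))))      # certain position
    print(is_unsafe(*args, np.diag([25.0, 25.0])))  # uncertain position
```

With a perfectly known obstacle position the example velocity passes outside the cone, while a 5 m standard deviation widens the cone enough to flag the same velocity. A shield built on this check would replace a flagged exploratory action with a safe alternative; how the paper selects that alternative is not stated in the abstract.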

Problem

Research questions and friction points this paper is trying to address.

Unmanned Surface Vehicles

Reinforcement Learning

COLREGs compliance

Perception uncertainty

Safe navigation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Credibility-Aware Reinforcement Learning

Covariance-Inflated Velocity Obstacle

Risk-Aware COLREGs Embedding