🤖 AI Summary
Autonomous vehicles at unsignalized intersections can make unreliable safety-critical decisions because of perceptual limitations and environmental stochasticity. To address this, the paper proposes a risk-aware decision-making and control framework that couples uncertainty quantification with safety constraints. It combines ensemble distributional reinforcement learning with an uncertainty-driven high-order control barrier function (HOCBF) safety filter, and it switches online between the safety-guaranteed policy and the more flexible RL policy. The approach models both state and model uncertainty and uses the HOCBF to synthesize safety-feasible control inputs in real time. In multi-agent unsignalized-intersection simulations, the method reduces the average collision rate by 62% relative to baseline approaches while keeping traffic throughput above 98.3%, improving safety without sacrificing practicality in complex interactive scenarios.
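To make the idea of a risk-aware ensemble of distributional critics concrete, here is a minimal sketch. It is not the paper's implementation: it assumes each critic outputs return quantiles per action, uses lower-tail CVaR as the risk measure, and takes the standard deviation across critics as the uncertainty estimate. The function names, the `alpha` parameter, and the toy numbers are all hypothetical.

```python
import numpy as np

def cvar_lower_tail(quantiles, alpha=0.25):
    """Mean of the worst alpha-fraction of return quantiles (assumed risk measure)."""
    q = np.sort(np.asarray(quantiles, dtype=float), axis=-1)
    k = max(1, int(np.ceil(alpha * q.shape[-1])))  # number of tail quantiles kept
    return q[..., :k].mean(axis=-1)

def ensemble_risk_and_uncertainty(ensemble_quantiles, alpha=0.25):
    """ensemble_quantiles has shape (n_critics, n_actions, n_quantiles)."""
    cvar = cvar_lower_tail(ensemble_quantiles, alpha)  # (n_critics, n_actions)
    risk_value = cvar.mean(axis=0)    # risk-averse value per action
    uncertainty = cvar.std(axis=0)    # disagreement between critics per action
    return risk_value, uncertainty

# Toy data: 2 critics, 2 actions, 4 quantiles each (illustrative values only).
ens = np.array([
    [[0.0, 1.0, 2.0, 3.0], [-4.0, 2.0, 4.0, 6.0]],
    [[0.0, 1.0, 2.0, 3.0], [-2.0, 2.0, 4.0, 6.0]],
])
risk_value, uncertainty = ensemble_risk_and_uncertainty(ens, alpha=0.25)
best_action = int(np.argmax(risk_value))
print(risk_value, uncertainty, best_action)
```

In this toy case action 1 has the higher mean return but a bad lower tail, so the risk-averse choice is action 0. The critics also disagree about action 1's tail, which shows up as nonzero uncertainty for that action.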
📝 Abstract
Reinforcement learning (RL) has demonstrated potential in autonomous driving (AD) decision tasks. However, applying RL to urban AD, particularly in intersection scenarios, still faces significant challenges. Without safety constraints, RL policies are vulnerable to risk. Additionally, cognitive limitations and environmental randomness can lead to unreliable decisions in safety-critical scenarios. It is therefore essential to quantify confidence in RL decisions to improve safety. This paper proposes an Uncertainty-aware Safety-Critical Decision and Control (USDC) framework, which generates a risk-averse policy by constructing a risk-aware ensemble distributional RL model while estimating uncertainty to quantify the policy's reliability. A high-order control barrier function (HOCBF) is then employed as a safety filter that minimizes intervention in the RL policy while dynamically tightening its constraints based on the estimated uncertainty. The ensemble critics evaluate both the HOCBF and RL policies, embedding uncertainty to switch dynamically between safe and flexible strategies and thereby balance safety and efficiency. Simulation tests on unsignalized intersections across multiple tasks indicate that USDC improves safety while maintaining traffic efficiency compared to baselines.
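The safety-filter step can be illustrated with a small sketch under strong simplifying assumptions that are not taken from the paper: the ego vehicle is a 1-D double integrator (position `p`, speed `v`, acceleration as control), the safe set is "stay short of a conflict point" with h = p_conflict - p - margin, and the uncertainty estimate simply inflates the margin. With class-K gains `k1` and `k2`, the second-order HOCBF condition reduces to the upper bound a <= -k1*v + k2*(-v + k1*h), and the minimal-intervention QP with a single scalar control has a closed-form solution. All names, gains, and limits below are hypothetical.

```python
def hocbf_accel_bound(p, v, p_conflict, margin, k1=1.0, k2=1.0):
    """Upper bound on acceleration from a second-order HOCBF (double integrator)."""
    h = p_conflict - p - margin   # barrier: remaining distance minus margin
    psi1 = -v + k1 * h            # first-order term: h_dot + k1 * h
    return -k1 * v + k2 * psi1    # from psi1_dot + k2 * psi1 >= 0

def safety_filter(a_rl, p, v, p_conflict, uncertainty,
                  base_margin=2.0, kappa=5.0, a_min=-6.0, a_max=3.0):
    """Closest acceleration to a_rl satisfying the HOCBF bound and actuator limits."""
    # Assumption: higher uncertainty enlarges the safety margin linearly.
    margin = base_margin + kappa * uncertainty
    a_bound = hocbf_accel_bound(p, v, p_conflict, margin)
    a_safe = min(a_rl, a_bound)   # closed-form solution of the 1-D QP
    # If a_bound < a_min the constraint is infeasible; fall back to full braking.
    return max(a_min, min(a_max, a_safe))

# Same RL action and state, increasing uncertainty (illustrative values only).
for u in (0.0, 2.0, 4.0):
    print(u, safety_filter(a_rl=2.0, p=0.0, v=5.0, p_conflict=20.0, uncertainty=u))
```

With low uncertainty the RL action passes through unchanged (2.0). As uncertainty grows, the filter first overrides it with mild braking (-2.0) and then saturates at the braking limit (-6.0). A fuller treatment would solve a QP over multiple constraints and controls rather than use this scalar shortcut.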