🤖 AI Summary
This work addresses efficient navigation in crowded environments where obstacle motion is highly uncertain and existing methods often lose performance to overly conservative interaction policies. The authors propose an end-to-end risk-adaptive navigation framework that couples reinforcement learning with a differentiable quadratic-programming safety layer built on Conditional Value-at-Risk (CVaR) barrier functions. Obstacle-motion uncertainty is modeled with a Gaussian mixture model, and the nominal control input, risk level, and safety margin are learned jointly while explicit probabilistic safety constraints are enforced. Because the risk level adapts to context, the method is reported to outperform optimization-based, reinforcement learning, and hybrid baselines across diverse dynamic crowded scenarios, including out-of-distribution settings, in safety, navigation efficiency, and generalization.
📝 Abstract
Planning through crowded environments under uncertain obstacle motions remains difficult, as stochastic interactions often induce overly conservative behavior or reduced efficiency. To address this challenge, we propose an end-to-end risk adaptation framework for crowd navigation under obstacle-motion uncertainty modeled by a Gaussian mixture model. The framework combines reinforcement learning (RL) with a differentiable quadratic-program safety layer based on Conditional Value-at-Risk (CVaR) barrier functions, jointly learning the nominal control input, risk level, and safety margin while enforcing explicit probabilistic safety constraints. This design enables context-aware adaptation, promoting efficient behavior while invoking caution only when necessary. We conduct extensive evaluations in dynamic, uncertain, and crowded environments across varying obstacle densities and robot models, and further assess generalization under three out-of-distribution cases. We compare against optimization-based, RL-based, and integrated RL-and-optimization methods, and the proposed method delivers the strongest overall performance in safety, efficiency, and generalization under uncertainty.
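To make the idea of a CVaR-based safety condition under Gaussian-mixture uncertainty more concrete, the following Python sketch estimates the CVaR of a barrier violation from samples and checks whether it is non-positive. Everything in it is an illustrative assumption rather than the paper's formulation: the function names, the squared-distance barrier `h`, the sampling-based CVaR estimator, and the parameters `beta`, `radius`, and `margin` are hypothetical. The differentiable quadratic-program layer and the RL policy that would learn the risk level and margin are not reproduced here.

```python
import numpy as np


def sample_gmm(weights, means, covs, n, rng):
    """Draw n samples from a 2-D Gaussian mixture (assumed obstacle-position model)."""
    weights = np.asarray(weights, dtype=float)
    comp = rng.choice(len(weights), size=n, p=weights)  # component index per sample
    samples = np.empty((n, 2))
    for k in range(len(weights)):
        idx = comp == k
        if idx.any():
            samples[idx] = rng.multivariate_normal(means[k], covs[k], size=int(idx.sum()))
    return samples


def empirical_cvar(losses, beta):
    """Average of the worst (1 - beta) fraction of the losses (upper tail)."""
    losses = np.sort(np.asarray(losses, dtype=float))
    k = max(1, int(np.ceil((1.0 - beta) * losses.size)))
    return losses[-k:].mean()


def cvar_barrier_satisfied(robot_pos, obstacle_samples, radius, margin, beta):
    """Check CVaR_beta(-h) <= 0 for the assumed barrier h = ||p - o||^2 - (radius + margin)^2."""
    diff = obstacle_samples - np.asarray(robot_pos, dtype=float)
    h = np.sum(diff**2, axis=1) - (radius + margin) ** 2
    return bool(empirical_cvar(-h, beta) <= 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical two-mode prediction: the obstacle may go straight or veer sideways.
    obstacles = sample_gmm(
        weights=[0.7, 0.3],
        means=[np.array([2.0, 0.0]), np.array([1.5, 1.0])],
        covs=[0.05 * np.eye(2), 0.10 * np.eye(2)],
        n=5000,
        rng=rng,
    )
    for beta in (0.5, 0.9, 0.99):
        ok = cvar_barrier_satisfied(np.zeros(2), obstacles, radius=0.5, margin=0.2, beta=beta)
        print(f"beta={beta}: constraint satisfied = {ok}")
```

In this sketch, raising `beta` or `margin` makes the check stricter, which is the kind of trade-off between caution and efficiency that a learned, context-dependent risk level and safety margin would be expected to tune.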