AI Summary
This work addresses interpretable risk assessment and robust decision-making in safe reinforcement learning (RL) when uncertainty arises simultaneously in observations, actions, and dynamics. The authors propose Fuz-RL, a framework that introduces fuzzy measures and the Choquet integral into safe RL for the first time. It constructs a Choquet integral-based fuzzy Bellman operator to estimate robust value functions. Theoretical analysis shows that solving the Fuz-RL problem, posed as a constrained Markov decision process (CMDP), is equivalent to solving a distributionally robust safe RL problem posed as a robust CMDP, which circumvents explicit min-max computation. Empirical evaluations on the safe-control-gym and safety-gymnasium benchmarks show that Fuz-RL integrates with existing safe RL baselines in a model-free manner and significantly improves their safety and control performance, while also improving interpretability and computational efficiency.
Abstract
Safe Reinforcement Learning (RL) is crucial for achieving high performance while ensuring safety in real-world applications. However, the complex interplay of multiple uncertainty sources in real environments poses significant challenges for interpretable risk assessment and robust decision-making. To address these challenges, we propose Fuz-RL, a fuzzy measure-guided robust framework for safe RL. Specifically, our framework develops a novel fuzzy Bellman operator for estimating robust value functions using Choquet integrals. Theoretically, we prove that solving the Fuz-RL problem (in Constrained Markov Decision Process (CMDP) form) is equivalent to solving distributionally robust safe RL problems (in robust CMDP form), effectively avoiding min-max optimization. Empirical analyses on safe-control-gym and safety-gymnasium scenarios demonstrate that Fuz-RL effectively integrates with existing safe RL baselines in a model-free manner, significantly improving both safety and control performance under various types of uncertainties in observation, action, and dynamics.
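For readers unfamiliar with the Choquet integral mentioned above, the following is a minimal sketch of the standard discrete Choquet integral with respect to a fuzzy measure (a monotone set function with value 0 on the empty set and 1 on the full set). It is not the paper's fuzzy Bellman operator. The three uncertainty labels, the numeric values, and the three example measures are hypothetical and chosen only for illustration.

```python
def choquet_integral(values, mu):
    """Discrete Choquet integral of `values` with respect to fuzzy measure `mu`.

    values: dict mapping each element to a real number.
    mu: callable taking a frozenset of elements and returning a number in [0, 1].
    """
    order = sorted(values, key=values.get)  # elements by ascending value
    total = 0.0
    previous = 0.0
    for i, element in enumerate(order):
        # Set of elements whose value is at least the current one.
        upper_set = frozenset(order[i:])
        total += (values[element] - previous) * mu(upper_set)
        previous = values[element]
    return total


# Hypothetical value estimates under three uncertainty sources (assumed numbers).
values = {"obs": 1.0, "act": 3.0, "dyn": 2.0}
universe = frozenset(values)


def uniform(subset):
    # Additive measure: the Choquet integral reduces to the arithmetic mean.
    return len(subset) / len(universe)


def pessimistic(subset):
    # Weight only on the full set: the integral returns the minimum value.
    return 1.0 if subset == universe else 0.0


def optimistic(subset):
    # Full weight on any non-empty set: the integral returns the maximum value.
    return 1.0 if subset else 0.0


mean_value = choquet_integral(values, uniform)       # approximately 2.0
worst_value = choquet_integral(values, pessimistic)  # 1.0
best_value = choquet_integral(values, optimistic)    # 3.0
```

The pessimistic measure shows why a single integral can stand in for a worst-case evaluation: the choice of fuzzy measure determines how conservatively the values are aggregated, with no explicit minimization step.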