AI Summary
This study addresses the inherent trade-off between probabilistic safety and permissiveness when shielding agents in Markov decision processes. It formally establishes, for the first time, that strong probabilistic safety and maximal permissiveness cannot be achieved simultaneously. Building on this result, the authors propose a shielding mechanism that guarantees strong probabilistic safety while ensuring only weak permissiveness, and develop both offline and online shield synthesis algorithms with rigorous theoretical guarantees. Experimental results indicate that the approach achieves high safety probabilities, is computationally feasible, and offers practical advantages.
Abstract
Shielding is a prominent model-based technique to ensure safety of autonomous agents. Classical shielding aims to ensure that nothing bad ever happens and comes with strong guarantees about safety and maximal permissiveness. However, shielding systems for probabilistic safety, where something bad is allowed to happen with an acceptable probability, has proven to be more intricate. This paper presents a formal framework that conservatively extends classical shields to probabilistic safety. In this framework, we (i) demonstrate the impossibility of preserving the strong guarantees on safety and permissiveness, (ii) provide natural shields with weaker guarantees, and (iii) introduce offline and online shield constructions ensuring strong safety guarantees. The empirical evaluation highlights the practical advantages of the new shields, as well as their computational feasibility.
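To make the notion of probabilistic safety concrete, the following minimal sketch computes, for a toy Markov decision process, the smallest achievable probability of ever reaching an unsafe state, and then filters actions by a risk threshold. Everything here is an assumption made for illustration: the MDP `TOY_MDP`, the names `min_risk`, `action_risk`, and `allowed_actions`, and the thresholds are hypothetical. It is a naive threshold filter, not the paper's offline or online shield construction, and it makes no claim to the safety or permissiveness guarantees discussed above.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy MDP: mdp[state][action] = [(next_state, probability), ...]
MDP = Dict[str, Dict[str, List[Tuple[str, float]]]]

TOY_MDP: MDP = {
    "start": {
        "careful": [("goal", 1.0)],
        "fast": [("goal", 0.9), ("crash", 0.1)],
        "reckless": [("goal", 0.5), ("crash", 0.5)],
    },
    "goal": {"stay": [("goal", 1.0)]},
    "crash": {"stay": [("crash", 1.0)]},
}
UNSAFE: Set[str] = {"crash"}


def action_risk(mdp: MDP, risk: Dict[str, float], state: str, action: str) -> float:
    """Expected risk of taking `action` in `state`, given per-state risks."""
    return sum(p * risk[nxt] for nxt, p in mdp[state][action])


def min_risk(mdp: MDP, unsafe: Set[str], max_iters: int = 10_000,
             tol: float = 1e-12) -> Dict[str, float]:
    """Value iteration from below for the minimal probability of
    eventually reaching an unsafe state."""
    risk = {s: (1.0 if s in unsafe else 0.0) for s in mdp}
    for _ in range(max_iters):
        change = 0.0
        for s in mdp:
            if s in unsafe:
                continue  # unsafe states keep risk 1
            new = min(action_risk(mdp, risk, s, a) for a in mdp[s])
            change = max(change, abs(new - risk[s]))
            risk[s] = new
        if change < tol:
            break
    return risk


def allowed_actions(mdp: MDP, risk: Dict[str, float], state: str,
                    threshold: float) -> List[str]:
    """Naive filter: keep actions whose risk does not exceed `threshold`."""
    return [a for a in mdp[state] if action_risk(mdp, risk, state, a) <= threshold]


risk = min_risk(TOY_MDP, UNSAFE)
print(allowed_actions(TOY_MDP, risk, "start", threshold=0.2))
print(allowed_actions(TOY_MDP, risk, "start", threshold=0.0))
```

With a threshold of 0.0 the filter behaves like a classical shield that tolerates no risk at all; raising the threshold admits riskier actions. How such a threshold should be chosen and maintained over time so that the overall safety guarantee still holds is exactly the kind of question the paper's framework addresses.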