🤖 AI Summary
Safety shielding in infinite-state systems is often unrealizable due to inherent contradictions within safety specifications, yet existing approaches lack explainable, formal diagnostics for such unrealizability.
Method: We propose the first explanation-generation framework grounded in temporal logic formula expansion, which automatically derives both conditional and unconditional logical contradictions responsible for shielding failure—enabling formal attribution of unrealizability. Our approach integrates formal verification with symbolic reasoning, avoiding explicit state-space enumeration.
Contribution/Results: Experiments demonstrate that the framework accurately identifies diverse inconsistency patterns across specification variants, covering both conditional and unconditional contradictions. It significantly enhances the debuggability and design transparency of shielding mechanisms in safety-critical reinforcement learning, offering principled, human-interpretable justifications for shielding failures.
📝 Abstract
Safe Reinforcement Learning focuses on developing optimal policies while ensuring safety. A popular method for this task is shielding, in which a correct-by-construction safety component is synthesized from logical specifications. Recently, shield synthesis has been extended to infinite-state domains, such as continuous environments, making shielding applicable to more realistic scenarios. However, shields are often unrealizable because the specification is inconsistent (e.g., contradictory). To address this gap, we present a method, based on temporal formula unrolling, for obtaining simple unconditional and conditional explanations that witness unrealizability. In this paper, we present several variants of the technique and demonstrate its applicability.
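To give an intuition for the unrolling idea, the sketch below expands "always" (G) safety clauses over a bounded horizon and brute-force searches for a satisfying trace; finding none witnesses an unrealizable (contradictory) specification. The encoding and helper names are illustrative assumptions, not the paper's actual formalism, which avoids explicit state enumeration.

```python
# Illustrative sketch: bounded unrolling of G-clauses to witness a
# contradictory safety specification. Names and encoding are hypothetical.
from itertools import product

def unroll(clauses, horizon):
    """Expand G(clause) constraints over steps 0..horizon-1.

    Each clause maps a single state (dict of Boolean variables) to bool;
    returns a predicate over a trace (list of states)."""
    def holds(trace):
        return all(c(trace[t]) for t in range(horizon) for c in clauses)
    return holds

def find_witness(clauses, variables, horizon=2):
    """Brute-force search for a trace satisfying the unrolled spec.

    Returns a satisfying trace, or None when the spec is contradictory
    (no assignment at any step can satisfy all clauses)."""
    holds = unroll(clauses, horizon)
    assignments = list(product([False, True], repeat=len(variables)))
    for states in product(assignments, repeat=horizon):
        trace = [dict(zip(variables, s)) for s in states]
        if holds(trace):
            return trace
    return None

# Unconditional contradiction: G(a) and G(not a) admit no trace at all.
contradictory = [lambda s: s["a"], lambda s: not s["a"]]
print(find_witness(contradictory, ["a"]))  # None: unrealizable

# Conditional contradiction: the clauses conflict only when b holds,
# so traces keeping b False remain safe.
conditional = [lambda s: (not s["b"]) or s["a"],
               lambda s: (not s["b"]) or not s["a"]]
print(find_witness(conditional, ["a", "b"]) is not None)  # True
```

The distinction mirrors the paper's two explanation kinds: an unconditional contradiction rules out every trace, while a conditional one is triggered only under a particular condition on the environment or state.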