🤖 AI Summary
Safety shielding in infinite-state systems is often unrealizable due to inherent contradictions within safety specifications, yet existing approaches lack explainable, formal diagnostics for such unrealizability.
Method: We propose the first explanation-generation framework grounded in temporal logic formula expansion, which automatically derives both conditional and unconditional logical contradictions responsible for shielding failure—enabling formal attribution of unrealizability. Our approach integrates formal verification with symbolic reasoning, avoiding explicit state-space enumeration.
Contribution/Results: Experiments demonstrate that the framework accurately identifies diverse inconsistency patterns across specification variants, covering both conditional and unconditional contradictions. It significantly enhances the debuggability and design transparency of shielding mechanisms in safety-critical reinforcement learning, offering principled, human-interpretable justifications for shielding failures.
📝 Abstract
Safe Reinforcement Learning focuses on developing optimal policies while ensuring safety. A popular method for this task is shielding, in which a correct-by-construction safety component is synthesized from logical specifications. Recently, shield synthesis has been extended to infinite-state domains, such as continuous environments, making shielding applicable to more realistic scenarios. However, shields are often unrealizable because the specification is inconsistent (e.g., contradictory). To address this gap, we present a method, based on temporal formula unrolling, for obtaining simple unconditional and conditional explanations that witness unrealizability. In this paper, we present several variants of the technique and demonstrate its applicability.
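To give an intuition for the unrolling idea, the sketch below expands "always" (G) safety clauses over a bounded horizon and brute-force searches for a satisfying trace; finding none witnesses an unrealizable (contradictory) specification. The encoding and helper names are illustrative assumptions, not the paper's actual formalism, which avoids explicit state enumeration.

```python
# Illustrative sketch: bounded unrolling of G-clauses to witness a
# contradictory safety specification. Names and encoding are hypothetical.
from itertools import product

def unroll(clauses, horizon):
    """Expand G(clause) constraints over steps 0..horizon-1.

    Each clause maps a single state (dict of Boolean variables) to bool;
    returns a predicate over a trace (list of states)."""
    def holds(trace):
        return all(c(trace[t]) for t in range(horizon) for c in clauses)
    return holds

def find_witness(clauses, variables, horizon=2):
    """Brute-force search for a trace satisfying the unrolled spec.

    Returns a satisfying trace, or None when the spec is contradictory
    (no assignment at any step can satisfy all clauses)."""
    holds = unroll(clauses, horizon)
    assignments = list(product([False, True], repeat=len(variables)))
    for states in product(assignments, repeat=horizon):
        trace = [dict(zip(variables, s)) for s in states]
        if holds(trace):
            return trace
    return None

# Unconditional contradiction: G(a) and G(not a) admit no trace at all.
contradictory = [lambda s: s["a"], lambda s: not s["a"]]
print(find_witness(contradictory, ["a"]))  # None: unrealizable

# Conditional contradiction: the clauses conflict only when b holds,
# so traces keeping b False remain safe.
conditional = [lambda s: (not s["b"]) or s["a"],
               lambda s: (not s["b"]) or not s["a"]]
print(find_witness(conditional, ["a", "b"]) is not None)  # True
```

The distinction mirrors the paper's two explanation kinds: an unconditional contradiction rules out every trace, while a conditional one is triggered only under a particular condition on the environment or state.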