Interval POMDP Shielding for Imperfect-Perception Agents

📅 2026-04-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Autonomous systems that rely on learned perception can make unsafe decisions when perception is uncertain. This work uses a limited amount of labeled data to estimate confidence intervals for the probabilities of perception outcomes and incorporates these intervals into a Partially Observable Markov Decision Process (POMDP), yielding an Interval POMDP model. Building on this formulation, the authors design an online belief-update algorithm that computes a conservative set of beliefs consistent with the observations seen so far, and use it to construct a runtime shield with a finite-horizon safety guarantee that holds with high probability over the training data. Experiments on four case studies show that the method improves safety over state-of-the-art baselines.
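To make the interval-estimation step concrete, here is a minimal sketch of how confidence intervals for perception-outcome probabilities could be built from labeled samples. The summary does not say which interval construction the authors use; this sketch assumes a two-sided Hoeffding bound with a union bound over outcomes, and the function name `hoeffding_intervals` and its parameters are hypothetical.

```python
import math
from collections import Counter


def hoeffding_intervals(observations, outcomes, delta):
    """Return {outcome: (lo, hi)} bounds on each perception-outcome probability.

    observations: perception outputs recorded for ONE fixed true state.
    outcomes:     all possible perception outputs.
    delta:        total allowed failure probability, split evenly over the
                  outcomes via a union bound (an assumption of this sketch).
    """
    n = len(observations)
    if n == 0:
        # No data: only the trivial interval is justified.
        return {o: (0.0, 1.0) for o in outcomes}
    counts = Counter(observations)
    # Hoeffding: P(|p_hat - p| >= eps) <= 2 * exp(-2 * n * eps^2).
    # Setting the right-hand side to delta / len(outcomes) and solving for eps:
    eps = math.sqrt(math.log(2 * len(outcomes) / delta) / (2 * n))
    intervals = {}
    for o in outcomes:
        p_hat = counts[o] / n
        intervals[o] = (max(0.0, p_hat - eps), min(1.0, p_hat + eps))
    return intervals


# Hypothetical data: the true state is "car", and the perception module
# reports "car" 90 times and "none" 10 times.
samples = ["car"] * 90 + ["none"] * 10
bounds = hoeffding_intervals(samples, ["car", "none"], delta=0.05)
```

With more labeled samples the half-width `eps` shrinks at a rate of one over the square root of `n`, so the intervals tighten as more labeled data becomes available.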

📝 Abstract

Autonomous systems that rely on learned perception can make unsafe decisions when sensor readings are misclassified. We study shielding for this setting: given a proposed action, a shield blocks actions that could violate safety. We consider the common case where system dynamics are known but perception uncertainty must be estimated from finite labeled data. From these data we build confidence intervals for the probabilities of perception outcomes and use them to model the system as a finite Interval Partially Observable Markov Decision Process with discrete states and actions. We then propose an algorithm to compute a conservative set of beliefs over the underlying state that is consistent with the observations seen so far. This enables us to construct a runtime shield that comes with a finite-horizon guarantee: with high probability over the training data, if the true perception uncertainty rates lie within the learned intervals, then every action admitted by the shield satisfies a stated lower bound on safety. Experiments on four case studies show that our shielding approach (and variants derived from it) improves the safety of the system over state-of-the-art baselines.
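The conservative belief set and the runtime shield can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the belief set is over-approximated by per-state lower and upper bounds, it checks safety one step ahead only (the paper's guarantee is finite-horizon), and all names (`predict`, `update`, `shield`, `T`, `O_lo`, `O_hi`) are hypothetical. Here `T[s][a][t]` is the known transition probability, and `O_lo[t][o]`, `O_hi[t][o]` are the learned interval bounds on observing `o` in state `t`.

```python
def predict(b_lo, b_hi, T, a):
    """Bound the next-state distribution after action a (known dynamics)."""
    n = len(b_lo)
    p_lo = [sum(b_lo[s] * T[s][a][t] for s in range(n)) for t in range(n)]
    p_hi = [min(1.0, sum(b_hi[s] * T[s][a][t] for s in range(n)))
            for t in range(n)]
    return p_lo, p_hi


def update(b_lo, b_hi, T, a, o, O_lo, O_hi):
    """Sound but loose bounds on the posterior after action a, observation o."""
    p_lo, p_hi = predict(b_lo, b_hi, T, a)
    n = len(p_lo)
    # Unnormalized posterior weights, bounded from below and above.
    w_lo = [O_lo[t][o] * p_lo[t] for t in range(n)]
    w_hi = [O_hi[t][o] * p_hi[t] for t in range(n)]
    new_lo, new_hi = [], []
    for t in range(n):
        rest_lo = sum(w_lo[u] for u in range(n) if u != t)
        rest_hi = sum(w_hi[u] for u in range(n) if u != t)
        # w / (w + rest) increases in w and decreases in rest, so pair the
        # smallest own weight with the largest rest for the lower bound,
        # and the reverse for the upper bound.
        den_lo = w_lo[t] + rest_hi
        new_lo.append(w_lo[t] / den_lo if den_lo > 0 else 0.0)
        new_hi.append(w_hi[t] / (w_hi[t] + rest_lo) if w_hi[t] > 0 else 0.0)
    return new_lo, new_hi


def shield(b_lo, b_hi, T, actions, unsafe, min_safety):
    """Admit actions whose worst-case one-step safety is at least min_safety."""
    allowed = []
    for a in actions:
        _, p_hi = predict(b_lo, b_hi, T, a)
        worst_unsafe = min(1.0, sum(p_hi[t] for t in unsafe))
        if 1.0 - worst_unsafe >= min_safety:
            allowed.append(a)
    return allowed


# Toy model: state 0 is safe, state 1 is unsafe (absorbing).
# Action 0 stays put; action 1 moves to the unsafe state with probability 0.5.
T = [[[1.0, 0.0], [0.5, 0.5]],
     [[0.0, 1.0], [0.0, 1.0]]]
O_lo = [[0.8, 0.1], [0.1, 0.8]]
O_hi = [[0.9, 0.2], [0.2, 0.9]]

allowed = shield([1.0, 0.0], [1.0, 0.0], T, [0, 1], unsafe=[1], min_safety=0.9)
post_lo, post_hi = update([1.0, 0.0], [1.0, 0.0], T, 1, 0, O_lo, O_hi)
```

In the toy model the shield admits only action 0 from the known-safe belief, because action 1 carries a 0.5 chance of reaching the unsafe state. The box bounds ignore the constraint that beliefs sum to one, which keeps the update simple at the cost of a looser belief set than a polytope-based representation would give.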

Problem

Research questions and friction points this paper is trying to address.

shielding

imperfect perception

safety guarantee

Interval POMDP

perception uncertainty

Innovation

Methods, ideas, or system contributions that make the work stand out.

Interval POMDP

Shielding

Perception Uncertainty