AI Summary
This paper addresses the problem of checking, at runtime, whether a probabilistic system model is aligned with the system's actual behavior. It proposes a lightweight alignment monitoring framework built around an alignment score that can be computed online, together with two variants: differential alignment monitoring, which compares two models, and weighted alignment monitoring, which supports task-specific customization. The monitors build on sequential forecasting: at each step the model predicts the next state, the prediction is compared with the observation using a distributional similarity measure (e.g., Wasserstein distance), and the verdict is updated as a high-probability interval estimate of the true alignment score. Experiments on the PRISM benchmark suite indicate that the monitors are fast, use little memory, and detect model-reality misalignment early. The work thereby offers a runtime check on the assumption that underlies formal verification of probabilistic systems, namely that the model matches reality.
Abstract
Formal verification provides assurances that a probabilistic system satisfies its specification, provided that the system model is aligned with reality. We propose alignment monitoring to check at runtime that this assumption is justified. We consider a probabilistic model well aligned if it accurately predicts the behaviour of an uncertain system in advance. An alignment score measures this by quantifying the similarity between the model's predicted distribution and the system's (unknown) actual distribution. An alignment monitor observes the system at runtime; at each point in time it uses the current state and the model to predict the next state. After the next state is observed, the monitor updates its verdict, which is a high-probability interval estimate for the true alignment score. We use tools from sequential forecasting to construct our alignment monitors. Besides a monitor for measuring the expected alignment score, we introduce a differential alignment monitor, designed for comparing two models, and a weighted alignment monitor, which permits task-specific alignment monitoring. We evaluate our monitors experimentally on the PRISM benchmark suite. They are fast, memory-efficient, and detect misalignment early.
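The predict, observe, update loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's construction. It assumes a finite-state model given as a dictionary of transition distributions, uses a quadratic (Brier-style) score rescaled to [0, 1] as a stand-in alignment score, and derives the interval from the Azuma-Hoeffding inequality with a union bound over time. The names `quadratic_score` and `AlignmentMonitor` are hypothetical.

```python
import math
from typing import Dict, Hashable, Tuple

State = Hashable
# Assumed model format: model[s][t] is the predicted probability of moving s -> t.
Model = Dict[State, Dict[State, float]]


def quadratic_score(predicted: Dict[State, float], observed: State) -> float:
    """Stand-in alignment score in [0, 1]; 1 means the observed state was predicted with certainty."""
    sq_err = sum(
        (p - (1.0 if s == observed else 0.0)) ** 2 for s, p in predicted.items()
    )
    if observed not in predicted:
        # The model assigned probability 0 to the state that actually occurred.
        sq_err += 1.0
    return 1.0 - 0.5 * sq_err


class AlignmentMonitor:
    """Keeps a running mean score and reports an interval around it (illustrative only)."""

    def __init__(self, model: Model, delta: float = 0.05) -> None:
        self.model = model
        self.delta = delta  # total allowed failure probability
        self.n = 0          # number of observed transitions
        self.total = 0.0    # sum of scores; the monitor stores only these two numbers

    def observe(self, state: State, next_state: State) -> Tuple[float, float]:
        """Score the model's prediction for `state` against `next_state` and return the new verdict."""
        self.total += quadratic_score(self.model[state], next_state)
        self.n += 1
        mean = self.total / self.n
        # Spend delta / (n * (n + 1)) at step n so the failure probabilities sum to delta.
        delta_n = self.delta / (self.n * (self.n + 1))
        # Azuma-Hoeffding radius for scores bounded in [0, 1].
        radius = math.sqrt(math.log(2.0 / delta_n) / (2.0 * self.n))
        return max(0.0, mean - radius), min(1.0, mean + radius)
```

Under these assumptions, the interval returned after each step contains the average of the conditionally expected scores with probability at least `1 - delta`, simultaneously for all steps. Because only a counter and a running sum are stored, memory use is constant in the length of the run.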