🤖 AI Summary
This paper addresses the lack of standardized benchmarks for stochastic model verification and policy synthesis in the ARCH-COMP’25 Friendly Competition.
Method: We propose a novel evaluation framework featuring (i) the first scalable benchmark suite for water distribution systems, accompanied by a curated set of simplified instances; (ii) unified integration of probabilistic model checking (for MDPs and CTMCs), stochastic control theory, and formal specifications expressed in Signal Temporal Logic (STL) and Probabilistic Temporal Logic (PTL); and (iii) development and integration of three new analysis tools.
Contributions/Results: We establish the first open-source benchmark library for stochastic model verification competitions; enable fair, reproducible, cross-team performance evaluation across seven international teams; and significantly improve comparability and reproducibility of verification results across diverse tools—thereby providing a scalable methodological foundation for future editions of the competition.
📝 Abstract
This report is concerned with a friendly competition for formal verification and policy synthesis of stochastic models. The main goal of the report is to introduce new benchmarks and their properties within this category and recommend next steps toward next year's edition of the competition. In particular, this report introduces three recently developed software tools, a new water distribution network benchmark, and a collection of simplified benchmarks intended to facilitate further comparisons among tools that were previously not directly comparable. This friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in Summer 2025.