1-2-3-Go! Policy Synthesis for Parameterized Markov Decision Processes via Decision-Tree Learning and Generalization

📅 2024-10-23
🏛️ International Conference on Verification, Model Checking and Abstract Interpretation
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing the challenge of policy synthesis in parameterized Markov decision processes (pMDPs), where state-space explosion limits scalability, this paper proposes a decision-tree-based method for generalizing policies across instance sizes. It first synthesizes optimal policies for small instances via probabilistic model checking, then induces the underlying state-action mapping with decision-tree learning, and finally deploys the learned policy on much larger pMDP instances without explicitly unfolding their state spaces. The work integrates decision-tree learning into pMDP policy synthesis, combining probabilistic verification, symbolic induction, and parametric generalization. Experiments on standard benchmarks demonstrate both high policy quality and strong scalability: the approach handles pMDP instances orders of magnitude larger than those tractable for current state-of-the-art tools.

📝 Abstract
Despite the advances in probabilistic model checking, the scalability of the verification methods remains limited. In particular, the state space often becomes extremely large when instantiating parameterized Markov decision processes (MDPs) even with moderate values. Synthesizing policies for such *huge* MDPs is beyond the reach of available tools. We propose a learning-based approach to obtain a reasonable policy for such huge MDPs. The idea is to generalize optimal policies obtained by model-checking small instances to larger ones using decision-tree learning. Consequently, our method bypasses the need for explicit state-space exploration of large models, providing a practical solution to the state-space explosion problem. We demonstrate the efficacy of our approach by performing extensive experimentation on the relevant models from the quantitative verification benchmark set. The experimental results indicate that our policies perform well, even when the size of the model is orders of magnitude beyond the reach of state-of-the-art analysis tools.
Problem

Research questions and friction points this paper is trying to address.

Synthesizing policies for huge parameterized MDPs
Generalizing small instance policies to larger MDPs
Addressing state-space explosion in probabilistic model checking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decision-tree learning for policy generalization
Bypassing explicit large state-space exploration
Generalizing small instance policies to large MDPs
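The core idea above, learn a compact state-action mapping from optimal policies of small instances and query it on states of much larger instances, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the state encoding (a `[queue_len, N]` feature vector), the action indices, and the toy training samples are all assumptions made up for the example.

```python
# Hedged sketch: generalizing optimal policies from small pMDP instances
# via decision-tree learning. All names and data here are illustrative.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical optimal state-action pairs obtained by model-checking
# small instances of a queueing model, for parameter values N = 2, 3.
# Each state is encoded as a feature vector of its variable valuations.
small_instance_samples = [
    # ([queue_len, N], optimal action index)
    ([0, 2], 0), ([1, 2], 1), ([2, 2], 1),
    ([0, 3], 0), ([1, 3], 1), ([2, 3], 1), ([3, 3], 1),
]
X = [state for state, _ in small_instance_samples]
y = [action for _, action in small_instance_samples]

# Induce a compact, interpretable state-action mapping from the samples.
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# The learned tree can now be queried on states of a far larger instance
# (here N = 1000) without ever unfolding that instance's state space.
action = tree.predict([[500, 1000]])[0]
```

The decision tree serves double duty: it generalizes across parameter values and, being a small symbolic object, yields an explainable controller that can be inspected directly.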
Muqsit Azeem
Technical University of Munich, Munich, Germany
Debraj Chakraborty
Masaryk University, Brno, Czech Republic
Sudeep Kanav
Masaryk University, Brno, Czech Republic
Jan Křetínský
Technical University of Munich, Munich, Germany; Masaryk University, Brno, Czech Republic
MohammadSadegh Mohagheghi
Vali-e-Asr University of Rafsanjan, Rafsanjan, Iran
Stefanie Mohr
Technical University of Munich, Munich, Germany
Maximilian Weininger
Ruhr University Bochum, Bochum, Germany
Probabilistic verification, game theory, explainable controllers