1-2-3-Go! Policy Synthesis for Parameterized Markov Decision Processes via Decision-Tree Learning and Generalization

📅 2024-10-23
🏛️ International Conference on Verification, Model Checking and Abstract Interpretation
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing the challenge of policy synthesis in parameterized Markov decision processes (pMDPs), where state-space explosion limits scalability, this paper proposes a decision-tree-based method for generalizing policies across instance sizes. It first synthesizes optimal policies for small instances via probabilistic model checking, then induces the underlying state-action mapping with decision-tree learning, and finally deploys the learned policy on much larger pMDP instances without explicitly unfolding their state spaces. The work integrates decision-tree learning into pMDP policy synthesis, combining probabilistic verification, symbolic induction, and parametric generalization. Experiments on standard benchmarks demonstrate both high policy quality and strong scalability: the approach handles pMDP instances orders of magnitude larger than those tractable for current state-of-the-art tools.

📝 Abstract
Despite the advances in probabilistic model checking, the scalability of the verification methods remains limited. In particular, the state space often becomes extremely large when instantiating parameterized Markov decision processes (MDPs) even with moderate values. Synthesizing policies for such *huge* MDPs is beyond the reach of available tools. We propose a learning-based approach to obtain a reasonable policy for such huge MDPs. The idea is to generalize optimal policies obtained by model-checking small instances to larger ones using decision-tree learning. Consequently, our method bypasses the need for explicit state-space exploration of large models, providing a practical solution to the state-space explosion problem. We demonstrate the efficacy of our approach by performing extensive experimentation on the relevant models from the quantitative verification benchmark set. The experimental results indicate that our policies perform well, even when the size of the model is orders of magnitude beyond the reach of state-of-the-art analysis tools.
Problem

Research questions and friction points this paper is trying to address.

Synthesizing policies for huge parameterized MDPs
Generalizing small instance policies to larger MDPs
Addressing state-space explosion in probabilistic model checking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decision-tree learning for policy generalization
Bypassing explicit large state-space exploration
Generalizing small instance policies to large MDPs
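The core idea above, learn a compact state-action mapping from optimal policies of small instances and query it on states of much larger instances, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the state encoding (a `[queue_len, N]` feature vector), the action indices, and the toy training samples are all assumptions made up for the example.

```python
# Hedged sketch: generalizing optimal policies from small pMDP instances
# via decision-tree learning. All names and data here are illustrative.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical optimal state-action pairs obtained by model-checking
# small instances of a queueing model, for parameter values N = 2, 3.
# Each state is encoded as a feature vector of its variable valuations.
small_instance_samples = [
    # ([queue_len, N], optimal action index)
    ([0, 2], 0), ([1, 2], 1), ([2, 2], 1),
    ([0, 3], 0), ([1, 3], 1), ([2, 3], 1), ([3, 3], 1),
]
X = [state for state, _ in small_instance_samples]
y = [action for _, action in small_instance_samples]

# Induce a compact, interpretable state-action mapping from the samples.
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# The learned tree can now be queried on states of a far larger instance
# (here N = 1000) without ever unfolding that instance's state space.
action = tree.predict([[500, 1000]])[0]
```

The decision tree serves double duty: it generalizes across parameter values and, being a small symbolic object, yields an explainable controller that can be inspected directly.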
Muqsit Azeem
Technical University of Munich, Munich, Germany
Debraj Chakraborty
Masaryk University, Brno, Czech Republic
Sudeep Kanav
Masaryk University, Brno, Czech Republic
Jan Křetínský
Technical University of Munich, Munich, Germany; Masaryk University, Brno, Czech Republic
MohammadSadegh Mohagheghi
Vali-e-Asr University of Rafsanjan, Rafsanjan, Iran
Stefanie Mohr
Technical University of Munich, Munich, Germany
Maximilian Weininger
Ruhr University Bochum, Bochum, Germany
Probabilistic verification, game theory, explainable controllers