madupite: A High-Performance Distributed Solver for Large-Scale Markov Decision Processes

📅 2025-02-20
🤖 AI Summary
Problem: Solving Markov decision processes (MDPs) with millions of states runs into the “curse of dimensionality,” and existing solvers offer poor computational efficiency, limited scalability, and little native support for distributed computing. Method: This paper introduces the first MDP solver framework that simultaneously achieves high scalability, native distributed computation, and flexible multi-algorithm support. It couples a high-performance C++ core with a user-friendly Python API, employs a hybrid parallel architecture that uses both MPI and shared memory, and provides a unified implementation of value iteration, policy iteration, and approximate dynamic programming. It also incorporates HPC-grade memory distribution and resource scheduling mechanisms. Results: Experiments demonstrate near-linear speedup on multi-node clusters when solving MDPs with over one million states. The framework outperforms existing tools such as pymdptoolbox by one to two orders of magnitude, substantially extending the scale of sequential decision-making problems that can be solved.
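As a point of reference for the solution methods named above, value iteration can be sketched in a few lines of plain Python. This is a minimal illustration on a made-up two-state MDP under a cost-minimization convention, which is an assumption here. It does not use or reproduce madupite's API, and the names `P`, `g`, `gamma`, `q_values`, and `value_iteration` are hypothetical.

```python
# Toy MDP (illustrative numbers only, not taken from the paper).
# P[a][s][t]: probability of moving from state s to state t under action a.
# g[s][a]:    stage cost of taking action a in state s.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.0, 1.0]],  # action 1
]
g = [[1.0, 2.0], [0.5, 0.0]]
gamma = 0.9  # discount factor


def q_values(P, g, gamma, v):
    """Return q[s][a] = g[s][a] + gamma * sum_t P[a][s][t] * v[t]."""
    n_actions, n_states = len(P), len(P[0])
    return [
        [g[s][a] + gamma * sum(P[a][s][t] * v[t] for t in range(n_states))
         for a in range(n_actions)]
        for s in range(n_states)
    ]


def value_iteration(P, g, gamma, tol=1e-10, max_iter=10_000):
    """Apply the Bellman update until the sup-norm change is below tol."""
    v = [0.0] * len(P[0])
    for _ in range(max_iter):
        v_new = [min(row) for row in q_values(P, g, gamma, v)]
        change = max(abs(x - y) for x, y in zip(v_new, v))
        v = v_new
        if change < tol:
            break
    # Greedy (cost-minimizing) policy with respect to the final values.
    q = q_values(P, g, gamma, v)
    policy = [min(range(len(P)), key=lambda a: q[s][a]) for s in range(len(v))]
    return v, policy


v, policy = value_iteration(P, g, gamma)
```

Each sweep touches every state-action pair, so cost and memory grow with the size of the state space. That growth is the pressure that motivates distributing the work for MDPs with millions of states.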

πŸ“ Abstract
This paper introduces madupite, a high-performance distributed solver for large-scale Markov Decision Processes (MDPs). MDPs are widely used to model complex dynamical systems in various fields, including finance, epidemiology, and traffic control. However, real-world applications often result in extremely high-dimensional MDPs, leading to the curse of dimensionality, which is typically addressed through function approximators like neural networks. While existing solvers such as pymdptoolbox and mdpsolver provide tools for solving MDPs, they either lack scalability, support for distributed computing, or flexibility in solution methods. madupite is designed to overcome these limitations by leveraging modern high-performance computing resources. It efficiently distributes memory load and computation across multiple nodes, supports a diverse set of solution methods, and offers a user-friendly Python API while maintaining a C++ core for optimal performance. With the ability to solve MDPs with millions of states, madupite provides researchers and engineers with a powerful tool to tackle large-scale decision-making problems with greater efficiency and flexibility.
Problem

Research questions and friction points this paper is trying to address.

Distributed solver for large-scale Markov Decision Processes
Overcomes scalability issues in existing MDP solvers
Supports diverse methods for high-dimensional decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed memory load across nodes
Supports diverse solution methods
Python API with C++ core
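Spreading memory across nodes usually starts with deciding which states each process owns. The sketch below shows a generic contiguous block partition of the state space. It is an assumed layout for illustration, not a confirmed description of how madupite partitions its data, and `owned_states` is a hypothetical helper.

```python
def owned_states(n_states, n_ranks, rank):
    """Return the half-open range [start, end) of states owned by `rank`.

    States are split into contiguous blocks whose sizes differ by at most
    one; the first (n_states % n_ranks) ranks get the extra state.
    """
    base, extra = divmod(n_states, n_ranks)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end


# Example: 10 states split across 3 processes.
layout = [owned_states(10, 3, r) for r in range(3)]
```

Under this layout, each process would hold only the transition rows and value-vector entries for its own block of states, so per-process memory shrinks as processes are added.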
Matilde Gargiani
ETH Zurich
dynamic programming, reinforcement learning, optimization, machine learning, deep learning
Robin Sieber
Automatic Control Laboratory (IfA), ETH Zurich, 8092 Zurich, Switzerland
Philip Pawlowsky
Automatic Control Laboratory (IfA), ETH Zurich, 8092 Zurich, Switzerland
Václav Hapla
Department of Earth and Planetary Sciences, ETH Zurich, 8092 Zurich, Switzerland; Department of Applied Mathematics, FEECS at VSB-TU Ostrava, Czechia
John Lygeros
Prof. of Computation and Control, ETH Zurich
Automatic control, systems biology, power systems, air traffic management