AI Summary
Solving Markov decision processes (MDPs) with millions of states runs into the "curse of dimensionality": existing tools are computationally inefficient, scale poorly, and lack native support for distributed computing.
Method: This paper introduces the first MDP solver framework that simultaneously achieves high scalability, native distributed computation, and flexible multi-algorithm support. It couples a high-performance C++ core with a user-friendly Python API, employs a hybrid parallel architecture that uses both MPI and shared memory, and provides a unified implementation of value iteration, policy iteration, and approximate dynamic programming. It also incorporates HPC-grade memory distribution and resource scheduling mechanisms.
Results: Experiments demonstrate near-linear speedup on multi-node clusters when solving MDPs with over one million states. The framework outperforms existing tools such as pymdptoolbox by one to two orders of magnitude, substantially increasing the size of sequential decision-making problems that can be solved.
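For readers unfamiliar with the solution methods named above, the following is a minimal sketch of value iteration on a toy two-state MDP, written in plain Python. It is not the framework's API or implementation; the function name `value_iteration`, the data layout `P[a][s][t]` and `g[s][a]`, and the cost-minimizing convention are all assumptions made for illustration.

```python
def value_iteration(P, g, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Illustrative value iteration for a discounted, cost-minimizing MDP.

    P[a][s][t] is the probability of moving from state s to state t under
    action a; g[s][a] is the stage cost of taking action a in state s.
    Both layouts are assumptions of this sketch.
    """
    n_states = len(g)
    n_actions = len(g[0])
    V = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        V_new = []
        policy = []
        for s in range(n_states):
            # One-step lookahead cost of each action from state s.
            q = [
                g[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            best = min(range(n_actions), key=q.__getitem__)
            V_new.append(q[best])
            policy.append(best)
        # Stop once the largest change in any state's value is below tol.
        diff = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if diff < tol:
            break
    return V, policy


# Toy problem: state 1 is absorbing and free. In state 0, action 0 stays
# (cost 1 per step) and action 1 jumps to state 1 (one-off cost 2).
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [0.0, 1.0]],  # action 1: move to state 1
]
g = [[1.0, 2.0], [0.0, 0.0]]
V, policy = value_iteration(P, g, gamma=0.9)
```

With a discount factor of 0.9, staying in state 0 forever would cost 10 in total, so the sketch settles on paying 2 once to leave. Each sweep touches every state-action-successor triple, which is what becomes expensive at millions of states.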
Abstract
This paper introduces madupite, a high-performance distributed solver for large-scale Markov Decision Processes (MDPs). MDPs are widely used to model complex dynamical systems in fields such as finance, epidemiology, and traffic control. However, real-world applications often result in extremely high-dimensional MDPs. This is the curse of dimensionality, which is typically addressed with function approximators such as neural networks. Existing solvers such as pymdptoolbox and mdpsolver provide tools for solving MDPs, but they lack scalability, support for distributed computing, or flexibility in solution methods. madupite is designed to overcome these limitations by leveraging modern high-performance computing resources. It efficiently distributes memory load and computation across multiple nodes, supports a diverse set of solution methods, and offers a user-friendly Python API while maintaining a C++ core for optimal performance. With the ability to solve MDPs with millions of states, madupite gives researchers and engineers a powerful tool for tackling large-scale decision-making problems with greater efficiency and flexibility.
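To make the idea of distributing memory and computation across nodes concrete, here is a small sketch of one common approach: split the state space into contiguous blocks, let each worker apply the Bellman backup to its own block, and concatenate the results. The abstract does not say how madupite partitions its data, so this row-wise scheme is an assumption. The workers are simulated sequentially in plain Python rather than with MPI, and the names `partition`, `local_backup`, and `distributed_backup` are hypothetical.

```python
def partition(n_states, n_workers):
    """Split range(n_states) into contiguous, nearly equal blocks."""
    base, extra = divmod(n_states, n_workers)
    blocks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def local_backup(rows, P, g, V, gamma):
    """Cost-minimizing Bellman backup restricted to the states in `rows`.

    A worker needs only its own rows of P and g, plus the full vector V.
    """
    n_actions = len(P)
    return [
        min(
            g[s][a] + gamma * sum(p * v for p, v in zip(P[a][s], V))
            for a in range(n_actions)
        )
        for s in rows
    ]


def distributed_backup(P, g, V, gamma, n_workers):
    """Simulate one sweep: every worker updates its block, then gather."""
    pieces = [
        local_backup(rows, P, g, V, gamma)
        for rows in partition(len(V), n_workers)
    ]
    # Concatenation stands in for an all-gather step between real processes.
    return [v for piece in pieces for v in piece]


# Toy chain with 4 states: action 0 stays, action 1 moves one state right.
# The last state is absorbing and free; all other states charge 1 or 2.
n = 4
P_stay = [[1.0 if t == s else 0.0 for t in range(n)] for s in range(n)]
P_move = [[1.0 if t == min(s + 1, n - 1) else 0.0 for t in range(n)] for s in range(n)]
P = [P_stay, P_move]
g = [[1.0, 2.0] for _ in range(n - 1)] + [[0.0, 0.0]]
V0 = [0.0] * n

serial = distributed_backup(P, g, V0, 0.9, n_workers=1)
split = distributed_backup(P, g, V0, 0.9, n_workers=3)
```

Because each state's update depends only on its own transition rows and the previous value vector, the partitioned sweep produces the same result as the serial one. In a real multi-node setting the value vector is the data that has to be exchanged between processes after every sweep.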