🤖 AI Summary
This study addresses the coordination challenge in decentralized inspection and maintenance planning for multi-component engineering systems by formulating the problem as a partially observable Markov decision process and employing multi-agent deep reinforcement learning (MADRL) approaches. The work covers a spectrum of training paradigms, from fully centralized to fully decentralized, including value-decomposition and actor-critic architectures. A novel set of benchmark environments with tunable redundancy is introduced to systematically evaluate, for the first time, the coordination capabilities and policy optimality of various MADRL algorithms. Experimental results demonstrate that decentralized policies achieve near-optimal performance in low-redundancy series systems, whereas coordination complexity increases significantly with higher redundancy. Despite this, all MADRL strategies consistently outperform an optimized heuristic baseline across the tested scenarios.
📝 Abstract
Inspection and maintenance (I&M) planning involves sequential decision-making under uncertainty and incomplete information, and can be modeled as a partially observable Markov decision process (POMDP). While single-agent deep reinforcement learning provides approximate solutions to POMDPs, it does not scale well to multi-component systems. Scalability can be achieved through multi-agent deep reinforcement learning (MADRL), which decentralizes decision-making across multiple agents, each locally controlling an individual component. However, this decentralization can induce cooperation pathologies that degrade the optimality of the learned policies. To examine these effects in I&M planning, we introduce a set of deteriorating systems in which redundancy is varied systematically. These benchmark environments are designed such that computation of centralized (near-)optimal policies remains tractable, enabling direct comparison of solution methods. We implement and benchmark a broad set of MADRL algorithms spanning training paradigms from fully centralized to fully decentralized, including value-factorization and actor-critic methods. Our results show a clear effect of redundancy on coordination: MADRL algorithms achieve near-optimal performance in series-like settings, whereas increasing redundancy amplifies coordination challenges and can lead to optimality losses. Nonetheless, decentralized agents learn structured policies that consistently outperform optimized heuristic baselines, highlighting both the promise and current limitations of decentralized learning for scalable maintenance planning.
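The benchmark idea described above (deteriorating multi-component systems with tunable redundancy, where agents trade off inspection and replacement costs against system-failure risk) can be sketched as a toy k-out-of-n environment. This is a minimal illustrative sketch, not the paper's actual benchmark: all class names, damage states, and cost parameters below are assumptions chosen for clarity.

```python
import random

class KOutOfNMaintenanceEnv:
    """Toy k-out-of-n deteriorating system (illustrative, not the paper's benchmark).

    Each of n components degrades through hidden damage states 0 (intact) .. 3 (failed).
    The system functions while at least k components are unfailed, so redundancy is
    tuned via k: k = n gives a series system, k = 1 a fully parallel one.
    Per-component actions: 0 = do nothing, 1 = inspect, 2 = replace.
    Agents only observe a component's true state by paying an inspection cost;
    otherwise only outright failure is visible (observation -1 means "unknown").
    """

    FAILED = 3  # terminal damage state of a component

    def __init__(self, n=4, k=4, degrade_prob=0.3, inspect_cost=1.0,
                 replace_cost=10.0, failure_penalty=100.0, seed=0):
        self.n, self.k = n, k
        self.degrade_prob = degrade_prob
        self.inspect_cost = inspect_cost
        self.replace_cost = replace_cost
        self.failure_penalty = failure_penalty
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.states = [0] * self.n   # true (hidden) damage states
        return [0] * self.n          # all components known intact at the start

    def step(self, actions):
        cost = 0.0
        obs = []
        for i, a in enumerate(actions):
            if a == 2:               # replace: pay to renew the component
                cost += self.replace_cost
                self.states[i] = 0
            # stochastic one-step degradation of every non-failed component
            if self.states[i] < self.FAILED and self.rng.random() < self.degrade_prob:
                self.states[i] += 1
            if a == 1:               # inspect: pay for a perfect observation
                cost += self.inspect_cost
                obs.append(self.states[i])
            else:                    # otherwise only failure is observable
                obs.append(self.FAILED if self.states[i] == self.FAILED else -1)
        working = sum(s < self.FAILED for s in self.states)
        if working < self.k:         # k-out-of-n requirement violated: system down
            cost += self.failure_penalty
        return obs, -cost            # shared reward = negative total cost
```

In this sketch, decentralized agents would each pick one entry of `actions` from their own observation history, while a centralized planner sees the joint observation vector; the shared reward is what creates the coordination problem in redundant (k < n) configurations.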