Boltzmann-based Exploration for Robust Decentralized Multi-Agent Planning

📅 2026-03-02
🤖 AI Summary
This work addresses the inefficiency of exploration in decentralized multi-agent planning under sparse or skewed reward conditions by proposing Coordinated Boltzmann MCTS (CB-MCTS). The method introduces, for the first time, a Boltzmann exploration mechanism into the decentralized multi-agent Monte Carlo Tree Search (MCTS) framework, augmented with a decaying entropy-based reward to simultaneously preserve exploration breadth and enhance policy focus. By designing an entropy regulation strategy tailored to multi-agent settings, CB-MCTS significantly improves the robustness and effectiveness of collaborative planning. Empirical results demonstrate that CB-MCTS substantially outperforms conventional Dec-MCTS in deceptive tasks while maintaining competitive performance on standard benchmarks.

📝 Abstract
Decentralized Monte Carlo Tree Search (Dec-MCTS) is widely used for cooperative multi-agent planning but struggles in sparse or skewed reward environments. We introduce Coordinated Boltzmann MCTS (CB-MCTS), which replaces deterministic UCT with a stochastic Boltzmann policy and a decaying entropy bonus for sustained yet focused exploration. While Boltzmann exploration has been studied in single-agent MCTS, applying it in multi-agent systems poses unique challenges. CB-MCTS is the first to address this. We analyze CB-MCTS in the simple-regret setting and show in simulations that it outperforms Dec-MCTS in deceptive scenarios and remains competitive on standard benchmarks, providing a robust solution for multi-agent planning.
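
The contrast the abstract draws between deterministic UCT and a stochastic Boltzmann policy with a decaying entropy bonus can be illustrated with a minimal single-node sketch. This is not the paper's algorithm: the function names, the temperature parameter, the decay schedule `beta0 / (1 + decay * visit)`, and the way the bonus is added to the reward are all assumptions made for illustration, and the multi-agent coordination step is omitted entirely.

```python
import math
import random


def uct_select(values, counts, c=1.4):
    """Deterministic UCT baseline: pick the child with the highest upper bound."""
    total = sum(counts)

    def score(i):
        if counts[i] == 0:
            return math.inf  # unvisited children are tried first
        return values[i] + c * math.sqrt(math.log(total) / counts[i])

    return max(range(len(values)), key=score)


def boltzmann_probs(values, temperature):
    """Softmax over value estimates; subtracting the max keeps exp() stable."""
    m = max(values)
    weights = [math.exp((v - m) / temperature) for v in values]
    z = sum(weights)
    return [w / z for w in weights]


def boltzmann_select(values, temperature, rng):
    """Stochastic selection: sample a child in proportion to its softmax weight."""
    probs = boltzmann_probs(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_bonus_weight(visit, beta0=1.0, decay=0.01):
    """Hypothetical decay schedule: the bonus weight shrinks as visits grow."""
    return beta0 / (1.0 + decay * visit)


def shaped_return(reward, probs, visit):
    """Assumed shaping: reward plus a decaying multiple of the policy entropy."""
    return reward + entropy_bonus_weight(visit) * entropy(probs)


if __name__ == "__main__":
    rng = random.Random(0)
    values = [0.2, 0.5, 0.1]  # illustrative value estimates for three children
    probs = boltzmann_probs(values, temperature=0.5)
    child = boltzmann_select(values, temperature=0.5, rng=rng)
    print(child, shaped_return(1.0, probs, visit=10))
```

In this sketch every child keeps a nonzero selection probability, whereas `uct_select` always returns the single highest-scoring child. The entropy term rewards a spread-out policy early on and fades as the visit count grows, which is one plausible reading of "sustained yet focused exploration".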
Problem

Research questions and friction points this paper is trying to address.

Decentralized Multi-Agent Planning
Sparse Rewards
Skewed Reward Environments
Exploration
Robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Boltzmann Exploration
Decentralized MCTS
Multi-Agent Planning
Entropy Bonus
Stochastic Policy