M2-PALE: A Framework for Explaining Multi-Agent MCTS-Minimax Hybrids via Process Mining and LLMs

📅 2026-04-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work tackles two problems in multi-agent decision-making: standard Monte Carlo Tree Search (MCTS) builds its tree so selectively that it can miss critical moves and fall into tactical traps, and the behavior of the resulting agents is hard to interpret. To add strategic depth, the authors embed shallow, full-width Minimax search in the rollout phase of MCTS. To explain the hybrid agent's decisions, they combine process mining techniques (Alpha Miner, iDHM, and Inductive Miner) with large language models (LLMs): the miners extract behavioral workflow models from agent execution traces, and the LLMs turn those models into human-readable causal and distal explanations. The approach is demonstrated in a small-scale checkers environment, which the authors present as a foundation for interpreting hybrid agents in more complex strategic domains.
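To make the idea of a Minimax-informed rollout concrete, here is a minimal Python sketch. It is not the authors' implementation: the page does not describe how the hybrid is wired together, so the toy game (single-pile Nim instead of checkers), the function names, and the move-selection rule are all assumptions chosen for illustration. At every rollout step, a depth-limited, full-width search looks for a proven win or loss and falls back to a random move when it finds neither.

```python
import random

# Toy game (an assumption for illustration, not the paper's checkers setup):
# one pile of stones, a move removes 1-3 stones, taking the last stone wins.
def legal_moves(stones):
    return [k for k in (1, 2, 3) if k <= stones]

def minimax(stones, depth):
    """Depth-limited, full-width negamax from the mover's point of view.

    Returns +1 for a win proven within `depth` plies, -1 for a proven loss,
    and 0 when the horizon is reached without a proof.
    """
    if stones == 0:
        return -1  # the previous player took the last stone, so the mover lost
    if depth == 0:
        return 0   # this sketch uses no heuristic evaluation at the horizon
    return max(-minimax(stones - k, depth - 1) for k in legal_moves(stones))

def informed_move(stones, depth, rng):
    """Rollout policy: prefer proven wins, avoid proven losses, else random.

    `depth` is assumed to be at least 1.
    """
    scored = [(-minimax(stones - k, depth - 1), k) for k in legal_moves(stones)]
    best = max(score for score, _ in scored)
    return rng.choice([k for score, k in scored if score == best])

def rollout(stones, depth, rng):
    """Play to the end; return +1 if the player to move first wins, else -1."""
    mover = 0
    while True:
        stones -= informed_move(stones, depth, rng)
        if stones == 0:
            return 1 if mover == 0 else -1
        mover = 1 - mover

if __name__ == "__main__":
    rng = random.Random(0)
    print(rollout(5, 3, rng))
```

With five stones and a three-ply search, the first player always finds the winning reply (leave four stones), a line that a purely random rollout would often miss. In a full MCTS agent, the value returned by `rollout` would be backed up through the tree as usual.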

📝 Abstract

Monte-Carlo Tree Search (MCTS) is a fundamental sampling-based search algorithm widely used for online planning in sequential decision-making domains. Despite its success in driving recent advances in artificial intelligence, understanding the behavior of MCTS agents remains a challenge for both developers and users. This difficulty stems from the complex search trees produced through the simulation of numerous future states and their intricate relationships. A known weakness of standard MCTS is its reliance on highly selective tree construction, which may lead to the omission of crucial moves and a vulnerability to tactical traps. To mitigate this, we incorporate shallow, full-width Minimax search into the rollout phase of multi-agent MCTS to enhance strategic depth. Furthermore, to demystify the resulting decision-making logic, we introduce M2-PALE (MCTS-Minimax Process-Aided Linguistic Explanations). This framework employs process mining techniques, specifically the Alpha Miner, iDHM, and Inductive Miner algorithms, to extract underlying behavioral workflows from agent execution traces. LLMs then synthesize these process models into human-readable causal and distal explanations. We demonstrate the efficacy of our approach in a small-scale checkers environment, establishing a scalable foundation for interpreting hybrid agents in increasingly complex strategic domains.
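As a rough illustration of what extracting a workflow from execution traces involves, the Python sketch below computes the directly-follows counts and the footprint relations that the Alpha Miner starts from. It is a simplified stand-in written in plain Python, not the M2-PALE pipeline: the activity names and the two example traces are hypothetical, and the later Alpha Miner steps (building a Petri net) as well as iDHM and Inductive Miner are not shown.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

def footprint(traces):
    """Classify every ordered activity pair as in an Alpha Miner footprint."""
    df = directly_follows(traces)
    activities = sorted({a for trace in traces for a in trace})
    relation = {}
    for a in activities:
        for b in activities:
            fwd, bwd = (a, b) in df, (b, a) in df
            if fwd and not bwd:
                relation[(a, b)] = "->"  # a causally precedes b
            elif bwd and not fwd:
                relation[(a, b)] = "<-"  # b causally precedes a
            elif fwd and bwd:
                relation[(a, b)] = "||"  # observed in both orders: parallel
            else:
                relation[(a, b)] = "#"   # never adjacent: choice or unrelated
    return relation

# Hypothetical traces: one search iteration per trace, with made-up activity names.
TRACES = [
    ["select", "expand", "minimax_rollout", "backpropagate"],
    ["select", "expand", "random_rollout", "backpropagate"],
]

if __name__ == "__main__":
    for pair, symbol in sorted(footprint(TRACES).items()):
        print(pair, symbol)
```

On these two traces, `expand` causally precedes both rollout variants, while the two variants never follow each other, which the footprint records as an exclusive choice. A structured summary of this kind is far more compact than a raw search tree, which is what makes it a plausible input for an LLM that has to phrase an explanation.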

Problem

Research questions and friction points this paper is trying to address.

MCTS

Minimax

Explainability

Multi-Agent

Decision-Making

Innovation

Methods, ideas, or system contributions that make the work stand out.

MCTS-Minimax hybrid

process mining

LLM-based explanation