Don't Make the LLM Read the Graph: Make the Graph Think

📅 2026-04-24

🤖 AI Summary

This work investigates whether explicit belief graphs improve large language models' higher-order theory-of-mind reasoning in cooperative multi-agent settings. Drawing on more than 3,000 controlled trials across four LLM families in the card game Hanabi, the study compares belief graphs supplied as prompt context with belief graphs that gate action selection through ranked shortlists. It identifies a “Planner Defiance” failure mode, in which some model families override correct planner recommendations; reports preliminary evidence that shallow belief graphs offer the best cost-benefit ratio; and shows that inter-agent conventions outperform all single-agent interventions in full games (+128% over baseline). On second-order theory-of-mind tasks, shortlist gating raises strong models' success rate from 20% to 100%.
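
To make the idea of shortlist gating concrete, here is a minimal Python sketch. It is not the paper's implementation: the names `ScoredAction`, `build_shortlist`, and `gated_choice`, the top-k cutoff, and the fallback rule are all assumptions made for illustration. The point it shows is structural: the model only ever picks from a ranked shortlist produced by the graph, so a choice outside the shortlist cannot be executed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class ScoredAction:
    """Hypothetical container: an action label plus a belief-graph score."""
    action: str
    score: float


def build_shortlist(candidates: Sequence[ScoredAction], k: int = 3) -> List[str]:
    """Rank candidates by score (highest first) and keep the top k labels."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    return [c.action for c in ranked[:k]]


def gated_choice(
    candidates: Sequence[ScoredAction],
    choose: Callable[[List[str]], str],
    k: int = 3,
) -> str:
    """Let `choose` (a stand-in for an LLM call) pick from the shortlist only.

    Assumed fallback: if the pick is not on the shortlist, use the
    top-ranked action instead of executing the off-list choice.
    """
    shortlist = build_shortlist(candidates, k)
    if not shortlist:
        raise ValueError("no candidate actions to choose from")
    picked = choose(shortlist)
    return picked if picked in shortlist else shortlist[0]
```

In this sketch the gate sits between the model and the environment, which is what distinguishes it from passing the graph as prompt context: an off-list answer is replaced rather than trusted.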

📝 Abstract

We investigate whether explicit belief graphs improve LLM performance in cooperative multi-agent reasoning. Through 3,000+ controlled trials across four LLM families in the cooperative card game Hanabi, we establish four findings. First, integration architecture determines whether belief graphs provide value: as prompt context, graphs are decorative for strong models and beneficial only for weak models on 2nd-order Theory of Mind (80% vs 10%, p<0.0001, OR=36.0); when graphs gate action selection through ranked shortlists, they become structurally essential even for strong models (100% vs 20% on 2nd-order ToM, p<0.001). Second, we identify "Planner Defiance," a model-family-specific failure where LLMs override correct planner recommendations at partial competence (90% override, replicated N=20); Gemini models show near-zero defiance while Llama 70B shows 90%, and models distinguish factual context (deferred to) from advisory recommendations (overridden). Third, full-game evidence confirms inter-agent conventions (+128% over baseline, p=0.003) outperform all single-agent interventions, and individual belief-graph components must be combined to produce gains. Fourth, preliminary scaling analysis (N=10/cell, exploratory) suggests graph depth has diminishing returns: shallow graphs provide the best cost-benefit ratio, while deeper ToM graphs appear harmful at larger player counts (-1.5 pts at 5-player, p=0.029).
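
As a quick sanity check on the first finding, the reported odds ratio can be reproduced from the two success rates alone. The sketch below assumes that OR=36.0 is the plain sample odds ratio of 80% versus 10%; the function names are hypothetical, and per-cell trial counts are not used because the abstract does not give them.

```python
def odds(p: float) -> float:
    """Convert a success probability in (0, 1) to odds p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_treatment: float, p_control: float) -> float:
    """Odds ratio of treatment success versus control success."""
    return odds(p_treatment) / odds(p_control)


# 80% vs 10%: odds of 4 against odds of 1/9, a ratio of about 36.
print(round(odds_ratio(0.80, 0.10), 1))
```

The same function cannot be applied to the gated comparison (100% vs 20%), because a 100% success rate has undefined odds without a continuity correction.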

Problem

Research questions and friction points this paper is trying to address.

belief graphs

multi-agent reasoning

Theory of Mind

LLM performance

cooperative AI

Innovation

Methods, ideas, or system contributions that make the work stand out.

belief graphs

multi-agent reasoning

Planner Defiance