How Far Are LLMs from Professional Poker Players? Revisiting Game-Theoretic Reasoning with Agentic Tool Use

📅 2026-01-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models often underperform in high-stakes strategic games like poker because they rely on heuristics, misunderstand facts about the game, and act inconsistently with their own stated reasoning, falling short of both expert human players and conventional game-theoretic algorithms. This work proposes ToolPoker, a framework that integrates an external game-theory-optimal (GTO) solver with a large language model through a tool-calling mechanism, behavioral cloning, and stepwise reinforcement learning. This integration enables the model to produce reasoning that is both game-theoretically consistent and interpretable by domain experts. Experimental results demonstrate that ToolPoker achieves state-of-the-art performance on realistic poker tasks and generates reasoning trajectories that closely align with established game-theoretic principles.

📝 Abstract

As Large Language Models (LLMs) are increasingly applied in high-stakes domains, their ability to reason strategically under uncertainty becomes critical. Poker provides a rigorous testbed, requiring not only strong actions but also principled, game-theoretic reasoning. In this paper, we conduct a systematic study of LLMs in multiple realistic poker tasks, evaluating both gameplay outcomes and reasoning traces. Our analysis reveals that LLMs fail to compete against traditional algorithms and identifies three recurring flaws: reliance on heuristics, factual misunderstandings, and a "knowing-doing" gap where actions diverge from reasoning. An initial attempt with behavior cloning and step-level reinforcement learning improves reasoning style but remains insufficient for accurate game-theoretic play. Motivated by these limitations, we propose ToolPoker, a tool-integrated reasoning framework that combines external solvers for GTO-consistent actions with more precise professional-style explanations. Experiments demonstrate that ToolPoker achieves state-of-the-art gameplay while producing reasoning traces that closely reflect game-theoretic principles.
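
The tool-integrated idea in the abstract can be illustrated with a minimal sketch. This is not the ToolPoker implementation: `toy_solver`, `decide_with_tool`, the state fields, and the thresholds are all hypothetical stand-ins chosen for illustration, and a real system would call an actual GTO solver and an LLM. The sketch only shows the general pattern of taking the action from the solver output and building the explanation from that same output, so the stated reasoning and the chosen action cannot diverge.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical types and names; none of them come from the paper.
Strategy = Dict[str, float]  # maps an action name to its probability


@dataclass
class Decision:
    action: str
    explanation: str


def toy_solver(state: dict) -> Strategy:
    """Stand-in for an external solver (assumption: NOT a real GTO solver).

    Uses a made-up rule based on pot odds versus estimated equity.
    """
    to_call, pot, equity = state["to_call"], state["pot"], state["equity"]
    # Equity needed to break even on a call.
    pot_odds = to_call / (pot + to_call)
    if equity < pot_odds:
        return {"fold": 1.0, "call": 0.0, "raise": 0.0}
    if equity > 0.7:
        return {"fold": 0.0, "call": 0.3, "raise": 0.7}
    return {"fold": 0.0, "call": 0.8, "raise": 0.2}


def decide_with_tool(state: dict, solver: Callable[[dict], Strategy]) -> Decision:
    """Query the solver, then derive both the action and the explanation from it."""
    strategy = solver(state)
    # Pick the highest-probability action; sampling from the strategy is an alternative.
    action = max(strategy, key=strategy.get)
    explanation = (
        f"Solver strategy {strategy}; "
        f"choosing '{action}' (p={strategy[action]:.2f})."
    )
    return Decision(action, explanation)


if __name__ == "__main__":
    # Facing a bet of 50 into a pot of 100 with 25% estimated equity.
    print(decide_with_tool({"pot": 100, "to_call": 50, "equity": 0.25}, toy_solver))
```

Because the explanation is generated from the solver's returned strategy rather than written independently, this pattern targets the "knowing-doing" gap directly; whether ToolPoker structures its tool calls this way is not stated in the abstract.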

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

Game-Theoretic Reasoning

Poker

Strategic Decision-Making

Uncertainty

Innovation

Methods, ideas, or system contributions that make the work stand out.

ToolPoker

game-theoretic reasoning

GTO