Real-Time Parallel Counterfactual Regret Minimization

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses real-time solving of imperfect-information games, where a near-equilibrium strategy must be computed within a few seconds per decision and existing single-threaded Counterfactual Regret Minimization (CFR) solvers fall short. The paper introduces Parallel CFR, which it presents as the first parallelization framework for real-time depth-limited CFR solving that also supports pruning, abstraction, and advanced CFR variants. Each CFR iteration is decomposed into a seven-stage pipeline that exploits two orthogonal dimensions of parallelism, by information set and by tree node, while leaf-node evaluation is offloaded to the GPU as batched neural network inference, forming a heterogeneous CPU-GPU pipeline. On a single desktop-class machine (NVIDIA DGX Spark), the approach achieves a 3.3–3.4× speedup over the single-threaded baseline on postflop streets of Heads-Up No-Limit Texas Hold'em, with a per-iteration time of roughly 47–54 ms, which allows hundreds of CFR iterations within a typical real-time decision budget.
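To make the iteration-count claim concrete, the following minimal sketch converts a per-iteration time into the number of iterations that fit in a decision budget. The 47 ms and 54 ms figures come from the summary above; the 5 s and 10 s budgets and the helper name `iterations_in_budget` are assumptions chosen for illustration, since the source only says "a few seconds".

```python
def iterations_in_budget(budget_s: float, iter_ms: int) -> int:
    """Number of whole CFR iterations that fit in a time budget.

    budget_s: decision budget in seconds (hypothetical values below).
    iter_ms:  time per CFR iteration in milliseconds (reported: 47-54 ms).
    """
    budget_ms = int(budget_s * 1000)
    return budget_ms // iter_ms


# Hypothetical budgets; the source does not state an exact figure.
for budget_s in (5, 10):
    slow = iterations_in_budget(budget_s, 54)  # slower reported iteration time
    fast = iterations_in_budget(budget_s, 47)  # faster reported iteration time
    print(f"{budget_s} s budget: {slow} to {fast} iterations")
```

Under these assumed budgets, the count ranges from about one hundred iterations at 5 s to about two hundred at 10 s, which ignores any fixed setup cost per decision.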

📝 Abstract

Counterfactual Regret Minimization (CFR) is the dominant algorithmic family for solving large imperfect-information games, underpinning breakthroughs such as Libratus and Pluribus in No-Limit Texas Hold'em poker. In real-time game-playing systems, the solver must compute a near-equilibrium strategy within a strict time budget of only a few seconds per decision, and the number of CFR iterations completed in this window directly determines play strength. We present Parallel CFR, the first parallelization framework for real-time depth-limited CFR solving that seamlessly integrates pruning, abstraction, and advanced CFR variants. We decompose each CFR iteration into a pipeline of seven stages and identify two orthogonal dimensions of parallelism: by information set and by tree node. Leaf node evaluation is offloaded to GPUs via batched neural network inference, creating a heterogeneous CPU-GPU pipeline. Experiments on Heads-Up No-Limit Texas Hold'em demonstrate that Parallel CFR achieves 3.3–3.4× speedup over the single-threaded baseline on postflop streets, with per-iteration time of ~47–54 ms on a depth-limited game tree with over 1 billion histories. All experiments run on a single desktop-class device (NVIDIA DGX Spark), enabling hundreds of CFR iterations within a typical real-time decision budget without requiring datacenter-scale infrastructure.
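As an illustration of why work can be split by information set, the sketch below applies standard regret matching, the strategy update at the core of CFR, to many information sets at once. This is not the paper's implementation: the seven pipeline stages are not listed here, NumPy vectorization merely stands in for multi-threaded execution, and the function name `regret_matching` and the example regret values are assumptions.

```python
import numpy as np


def regret_matching(cum_regret: np.ndarray) -> np.ndarray:
    """Standard regret matching, applied independently to each row.

    cum_regret: array of shape (num_infosets, num_actions) holding
    cumulative counterfactual regrets. Each row is one information set,
    and no row depends on any other, so rows can be processed in parallel.
    """
    positive = np.maximum(cum_regret, 0.0)
    totals = positive.sum(axis=1, keepdims=True)
    num_actions = cum_regret.shape[1]
    uniform = np.full_like(positive, 1.0 / num_actions)
    # Avoid division by zero; rows with no positive regret fall back to uniform.
    safe_totals = np.where(totals > 0, totals, 1.0)
    return np.where(totals > 0, positive / safe_totals, uniform)


# Hypothetical regrets for two information sets with three actions each.
regrets = np.array([[2.0, -1.0, 2.0],
                    [-3.0, -1.0, -2.0]])
strategy = regret_matching(regrets)
print(strategy)
```

The first row splits probability evenly between the two actions with positive regret, and the second row, which has no positive regret, falls back to the uniform strategy.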

Problem

Research questions and friction points this paper is trying to address.

Real-Time Solving

Counterfactual Regret Minimization

Imperfect-Information Games

Computational Efficiency

Decision Time Budget

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel CFR

Real-Time Solving

Heterogeneous CPU-GPU Pipeline