Evaluating Counterfactual Strategic Reasoning in Large Language Models

📅 2026-03-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether the strategic behavior of large language models in repeated game-theoretic settings stems from genuine reasoning or from reliance on memorized patterns. To separate the two, the authors introduce counterfactual variants of two canonical games, the Prisoner’s Dilemma and Rock-Paper-Scissors, that alter payoff structures and action labels to break familiar symmetries and dominance relations. A multi-metric evaluation comparing default and counterfactual instantiations shows that models struggle to adapt to the modified incentives and tend to fall back on familiar patterns. The findings point to limitations in incentive sensitivity, structural generalization, and strategic reasoning in counterfactual environments.
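To make "breaking a dominance relation" concrete, here is a minimal sketch that checks whether the row player has a strictly dominant action in a two-player game. The payoff numbers, the `dominant_action` helper, and the `COUNTERFACTUAL_PD` variant are illustrative assumptions, not values or code from the paper.

```python
from typing import Dict, Optional, Sequence, Tuple

# (row action, column action) -> (row payoff, column payoff)
Payoffs = Dict[Tuple[str, str], Tuple[float, float]]

ACTIONS = ("cooperate", "defect")

# Textbook Prisoner's Dilemma payoffs (assumed, not taken from the paper).
DEFAULT_PD: Payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

# Hypothetical counterfactual variant: mutual cooperation now pays more than
# unilateral defection, so defection no longer dominates.
COUNTERFACTUAL_PD: Payoffs = {
    ("cooperate", "cooperate"): (5, 5),
    ("cooperate", "defect"): (0, 3),
    ("defect", "cooperate"): (3, 0),
    ("defect", "defect"): (1, 1),
}


def dominant_action(payoffs: Payoffs, actions: Sequence[str]) -> Optional[str]:
    """Return the row player's strictly dominant action, or None if there is none."""
    for candidate in actions:
        beats_all_alternatives = all(
            payoffs[(candidate, opp)][0] > payoffs[(other, opp)][0]
            for other in actions
            if other != candidate
            for opp in actions
        )
        if beats_all_alternatives:
            return candidate
    return None


default_result = dominant_action(DEFAULT_PD, ACTIONS)
counterfactual_result = dominant_action(COUNTERFACTUAL_PD, ACTIONS)
```

Under these assumed payoffs, `default_result` is `"defect"` and `counterfactual_result` is `None`. A player that keeps defecting in the second game is following the familiar pattern rather than the stated incentives.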

📝 Abstract

We evaluate Large Language Models (LLMs) in repeated game-theoretic settings to assess whether strategic performance reflects genuine reasoning or reliance on memorized patterns. We consider two canonical games, Prisoner's Dilemma (PD) and Rock-Paper-Scissors (RPS), upon which we introduce counterfactual variants that alter payoff structures and action labels, breaking familiar symmetries and dominance relations. Our multi-metric evaluation framework compares default and counterfactual instantiations, showcasing LLM limitations in incentive sensitivity, structural generalization and strategic reasoning within counterfactual environments.
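As a rough illustration of how an altered payoff structure can break the symmetry of Rock-Paper-Scissors, the sketch below computes the expected payoff of each action against a uniformly random opponent. The `win_bonus` parameter and the specific bonus for winning with rock are hypothetical choices made for this example and are not the paper's counterfactual design.

```python
from fractions import Fraction
from typing import Dict, Optional

ACTIONS = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine: str, theirs: str, win_bonus: Optional[Dict[str, int]] = None) -> int:
    """Payoff to the player choosing `mine`; a win pays win_bonus[mine], default 1."""
    bonus = win_bonus or {}
    if mine == theirs:
        return 0
    if BEATS[mine] == theirs:
        return bonus.get(mine, 1)
    return -1


def expected_payoffs(
    opponent_mix: Dict[str, Fraction], win_bonus: Optional[Dict[str, int]] = None
) -> Dict[str, Fraction]:
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return {
        action: sum(prob * payoff(action, opp, win_bonus) for opp, prob in opponent_mix.items())
        for action in ACTIONS
    }


uniform = {action: Fraction(1, 3) for action in ACTIONS}

# Default game: every action earns the same expected payoff against uniform play.
default_ev = expected_payoffs(uniform)

# Hypothetical counterfactual: winning with rock pays 2 instead of 1.
counterfactual_ev = expected_payoffs(uniform, win_bonus={"rock": 2})
best_counterfactual_action = max(counterfactual_ev, key=counterfactual_ev.get)
```

In the default game all three actions are equally good against a uniform opponent. With the assumed bonus, rock becomes strictly better, so uniform play by both players is no longer a best response. An incentive-sensitive player would be expected to shift its action frequencies accordingly.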

Problem

Research questions and friction points this paper is trying to address.

Counterfactual Reasoning

Strategic Reasoning

Large Language Models

Game Theory

Incentive Sensitivity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Counterfactual Reasoning

Strategic Reasoning

Large Language Models