🤖 AI Summary
Directly adapting large language models (LLMs) to multi-agent systems (MAS) is challenging due to complex reward modeling, highly dynamic agent interactions, and stringent generalization requirements. Method: This paper proposes a domain-aligned post-training paradigm that uses economics as a structured testbed; it combines supervised fine-tuning (SFT) with reinforcement learning from verifiable rewards (RLVR) on a high-quality, self-constructed dataset of 2,100 economic reasoning problems to train a 7B open-weight LLM. Contribution/Results: This work introduces RLVR to economic reasoning for the first time, significantly improving the model's equilibrium prediction accuracy, strategic consistency, and economic rationality in unseen multi-agent games. It empirically demonstrates that post-training can effectively induce cross-task strategic generalization and align agents with rational behavioral patterns.
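To make "verifiable rewards" concrete, here is a minimal sketch of the kind of reward function RLVR relies on. The answer format (`Answer:` marker) and the exact-match check are illustrative assumptions, not the paper's actual implementation: the key idea is simply that each economics problem has a machine-checkable reference answer (e.g. an equilibrium price), so the reward needs no learned reward model.

```python
def extract_answer(completion: str) -> str:
    """Pull the final answer out of a model completion (hypothetical 'Answer:' format)."""
    marker = "Answer:"
    idx = completion.rfind(marker)
    return completion[idx + len(marker):].strip() if idx != -1 else ""

def verifiable_reward(completion: str, gold: str) -> float:
    """Binary verifiable reward: 1.0 iff the extracted answer matches the reference."""
    return 1.0 if extract_answer(completion) == gold.strip() else 0.0

# Example: reward a completion for a Bertrand-competition equilibrium question.
r = verifiable_reward(
    "Both firms undercut until price equals marginal cost. Answer: p = c",
    "p = c",
)
print(r)  # → 1.0
```

Because the reward is computed by direct checking rather than by a learned model, it sidesteps the reward-modeling difficulty the summary identifies for multi-agent settings, at the cost of requiring problems with unambiguous ground-truth answers.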
📄 Abstract
Directly training Large Language Models (LLMs) for Multi-Agent Systems (MAS) remains challenging due to intricate reward modeling, dynamic agent interactions, and demanding generalization requirements. This paper explores whether post-training techniques, specifically Supervised Fine-Tuning (SFT) and Reinforcement Learning with Verifiable Rewards (RLVR), can effectively *generalize* to multi-agent scenarios. We use economic reasoning as a testbed, leveraging its strong foundations in mathematics and game theory, its demand for structured analytical reasoning, and its relevance to real-world applications such as market design, resource allocation, and policy analysis. We introduce **Recon** (**R**easoning like an **ECON**omist), a 7B-parameter open-source LLM post-trained on a hand-curated dataset of 2,100 high-quality economic reasoning problems. Comprehensive evaluation on economic reasoning benchmarks and multi-agent games reveals clear improvements in structured reasoning and economic rationality. These results underscore the promise of domain-aligned post-training for enhancing reasoning and agent alignment, shedding light on the roles of SFT and RL in shaping model behavior. Code is available at https://github.com/MasterZhou1/Recon.