FinBoardBench: Benchmarking Dynamic Wealth Management and Strategic Financial Reasoning of LLMs via Board Game Simulations

📅 2026-05-26
🤖 AI Summary
Existing static financial benchmarks inadequately assess the true capabilities of large language models (LLMs) in dynamic wealth management. This work addresses this gap by formally modeling three classic financial board games—Cashflow, Acquire, and Monopoly—as multi-agent environments to construct a dynamic evaluation framework. The study systematically evaluates nine state-of-the-art LLMs on long-term decision-making tasks involving cash management, merger-and-acquisition investments, and asset bidding. Results reveal that while these models exhibit basic investment reasoning, they consistently overlook liquidity risk and are prone to financial distress under stochastic shocks. This highlights a significant disparity between their strong static reasoning abilities and weaker dynamic decision-making performance, underscoring a critical bottleneck in translating static financial knowledge into sustained, adaptive gains in volatile environments.
📝 Abstract
Large language models (LLMs) have recently achieved strong performance in static financial reasoning and simple dynamic trading tasks. However, existing static financial benchmarks are insufficient to assess the dynamic wealth management and financial decision-making capabilities of LLMs in real-world environments. To bridge this gap, we present FinBoardBench, an evaluation suite based on three classic financial board games: Cashflow, Acquire, and Monopoly. FinBoardBench assesses a comprehensive set of financial skills, including personal cash flow management with debt balancing, corporate investment and acquisition forecasting, and competitive trade negotiations with asset auctions. Our experiments with 9 advanced LLMs reveal that, while the models exhibit basic long-term planning and investment logic, they fail to effectively leverage complex interactions for profit, and their strong static reasoning performance does not translate into successful dynamic decision-making. Notably, they tend to prioritize immediate asset acquisition over maintaining sufficient liquidity, which leaves them vulnerable to financial crises triggered by random events. We hope that FinBoardBench can serve as a valuable reference for building more intelligent LLM-based decision-making systems.
Problem

Research questions and friction points this paper is trying to address.

dynamic wealth management
financial reasoning
LLMs
financial decision-making
benchmarking
Innovation

Methods, ideas, or system contributions that make the work stand out.

dynamic wealth management
financial reasoning
board game simulation
LLM evaluation benchmark
liquidity risk
Xuesi Hu
School of Computer Science and Engineering, Macau University of Science and Technology, Macau, China
Peng Wang
Macau University of Science and Technology
Natural Language Processing, Large Language Model, Agentic System
Jinpeng Miao
Research Scientist at Meta Platforms, Inc.
Cybersecurity, Edge Computing, Secure Coding
Xilin Tao
School of Computer Science and Engineering, Macau University of Science and Technology, Macau, China
Caiwei Li
Department of Computer and Information Science, University of Macau, Macau, China
Yue Ma
Bytedance
NLP, Dialogue System, LLM
Jie He
Professor of Computer Science, University of Science and Technology Beijing
Indoor localization, Internet of things and Machine Learning
Qiancheng Zhang
School of Economics, Anhui University, Anhui, China
Yuntao Zou
School of Energy and Power Engineering, Huazhong University of Science and Technology, Hubei, China
Dagang Li
Macau University of Science and Technology
Network, Graph, Time series, RL, LLM