Digital Player: Evaluating Large Language Models based Human-like Agent in Games

📅 2025-02-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the human-like behavioral capabilities of large language model (LLM)-driven autonomous agents—termed “digital players”—in complex strategy games, with emphasis on high-level cognitive tasks including numerical reasoning, multi-step planning, diplomatic negotiation, and deceptive social interaction. To this end, we develop an application-level evaluation platform built upon the open-source strategy game *Unciv*, and propose a data-flywheel evaluation paradigm specifically designed for digital players. We formally define and quantitatively assess LLM performance across three dimensions: long-term cooperation, dynamic strategic game-playing, and human-style response generation. The open-source *CivAgent* framework (available on GitHub) enables reproducible benchmarking. Experimental results reveal significant capability gaps in current state-of-the-art LLMs—particularly in sustained cooperative behavior and strategic deception—and identify concrete directions for improvement.
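The evaluation loop the summary describes—an agent observes the game state, the LLM proposes an action, and the resulting trace is logged for later assessment (the "data flywheel")—can be sketched roughly as follows. All names here (`GameState`, `play_turn`, `stub_llm`, the action set) are illustrative assumptions for exposition, not CivAgent's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class GameState:
    """Toy stand-in for an Unciv-style observation (hypothetical)."""
    turn: int = 0
    gold: int = 10
    log: list = field(default_factory=list)

LEGAL_ACTIONS = {"build_city", "research", "negotiate", "end_turn"}

def stub_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; always proposes 'research'."""
    return "research"

def play_turn(state: GameState, llm=stub_llm) -> GameState:
    """One loop step: observe -> propose -> validate -> record trace."""
    prompt = f"Turn {state.turn}: gold={state.gold}. Pick one of {sorted(LEGAL_ACTIONS)}."
    action = llm(prompt).strip()
    if action not in LEGAL_ACTIONS:         # invalid proposals fall back to a no-op
        action = "end_turn"
    state.log.append((state.turn, action))  # logged traces feed the evaluation flywheel
    state.turn += 1
    return state

state = GameState()
for _ in range(3):
    state = play_turn(state)
print(state.log)  # [(0, 'research'), (1, 'research'), (2, 'research')]
```

The key design point is the recorded `(turn, action)` trace: whatever the real platform's interfaces look like, per-turn logs are what make quantitative scoring of cooperation, strategic play, and response quality possible after the fact.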

📝 Abstract
With the rapid advancement of Large Language Models (LLMs), LLM-based autonomous agents have shown the potential to function as digital employees, such as digital analysts, teachers, and programmers. In this paper, we develop an application-level testbed based on the open-source strategy game "Unciv", which has millions of active players, to enable researchers to build a "data flywheel" for studying human-like agents in the "digital players" task. This "Civilization"-like game features expansive decision-making spaces along with rich linguistic interactions such as diplomatic negotiations and acts of deception, posing significant challenges for LLM-based agents in terms of numerical reasoning and long-term planning. Another challenge for "digital players" is to generate human-like responses for social interaction, collaboration, and negotiation with human players. The open-source project can be found at https://github.com/fuxiAIlab/CivAgent.
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLM-based agents in complex strategy games.
Challenges in numerical reasoning and long-term planning.
Generating human-like responses for social interactions.
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based agents in strategy games
Open-source testbed for human-like agents
Focus on social interaction and planning
Jiawei Wang
Fuxi AI Lab, NetEase Games; Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences
Kai Wang
Fuxi AI Lab, NetEase Games
Shaojie Lin
Fuxi AI Lab, NetEase Games
Runze Wu
Fuxi AI Lab, NetEase Games | University of Science and Technology of China
Bihan Xu
University of Science and Technology of China
Lingeng Jiang
Fuxi AI Lab, NetEase Games
Shiwei Zhao
Fuxi AI Lab, NetEase Games
Renyu Zhu
Fuxi AI Lab, NetEase Games | East China Normal University
Haoyu Liu
Fuxi AI Lab, NetEase Games
Zhipeng Hu
Fuxi AI Lab, NetEase Games
Zhong Fan
Fuxi AI Lab, NetEase Games
Le Li
Fuxi AI Lab, NetEase Games
Tangjie Lyu
Fuxi AI Lab, NetEase Games
Changjie Fan
Fuxi AI Lab, NetEase Games