🤖 AI Summary
This work investigates the human-like behavioral capabilities of large language model (LLM)-driven autonomous agents—termed “digital players”—in complex strategy games, with emphasis on high-level cognitive tasks including numerical reasoning, multi-step planning, diplomatic negotiation, and deceptive social interaction. To this end, we develop an application-level evaluation platform built upon the open-source strategy game *Unciv*, and propose a data-flywheel evaluation paradigm specifically designed for digital players. We formally define and quantitatively assess LLM performance across three dimensions: long-term cooperation, dynamic strategic gameplay, and human-style response generation. The open-source *CivAgent* framework (available on GitHub) enables reproducible benchmarking. Experimental results reveal significant capability gaps in current state-of-the-art LLMs—particularly in sustained cooperative behavior and strategic deception—and identify concrete directions for improvement.
📝 Abstract
With the rapid advancement of Large Language Models (LLMs), LLM-based autonomous agents have shown the potential to function as digital employees, such as digital analysts, teachers, and programmers. In this paper, we develop an application-level testbed based on the open-source strategy game "Unciv", which has millions of active players, to enable researchers to build a "data flywheel" for studying human-like agents in the "digital players" task. This "Civilization"-like game features expansive decision-making spaces along with rich linguistic interactions such as diplomatic negotiations and acts of deception, posing significant challenges for LLM-based agents in terms of numerical reasoning and long-term planning. Another challenge for "digital players" is to generate human-like responses for social interaction, collaboration, and negotiation with human players. The open-source project can be found at https://github.com/fuxiAIlab/CivAgent.