Don't Trust Generative Agents to Mimic Communication on Social Networks Unless You Benchmarked their Empirical Realism

📅 2025-06-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study challenges the reliability of current LLM-driven generative agents for social network analysis, pointing to the absence of systematic validation against real human behavior. To address this, the authors propose a framework for assessing *empirical realism*, requiring that agent simulations be validated in the same setting in which their components were fitted. Methodologically, the approach combines a formal framework for social network simulation, LLM-based imitation of user communication on X in English and German, and benchmarking against real-world communication traces. The empirical tests show that different imitation approaches diverge in how faithfully they reproduce observed user behavior, motivating the call for validation before simulation results are trusted. The paper thereby argues for a quantifiable, reproducible evaluation paradigm for generative social simulation and for more rigor in generative-agent-based modeling within computational social science.

📝 Abstract
The ability of Large Language Models (LLMs) to mimic human behavior triggered a plethora of computational social science research, assuming that empirical studies of humans can be conducted with AI agents instead. Since there have been conflicting research findings on whether and when this hypothesis holds, there is a need to better understand the differences in their experimental designs. We focus on replicating the behavior of social network users with the use of LLMs for the analysis of communication on social networks. First, we provide a formal framework for the simulation of social networks, before focusing on the sub-task of imitating user communication. We empirically test different approaches to imitate user behavior on X in English and German. Our findings suggest that social simulations should be validated by their empirical realism measured in the setting in which the simulation components were fitted. With this paper, we argue for more rigor when applying generative-agent-based modeling for social simulation.
Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs' ability to mimic human social network communication
Identifying differences in experimental designs for social simulations
Validating social simulations by measuring empirical realism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Formal framework for social network simulation
Empirical testing of user behavior imitation
Validation by empirical realism measurement
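The validation idea above can be made concrete: compare a distributional property of simulated posts against the same property measured on the real traces the simulation was fitted to, and report the agreement as a score. The sketch below is illustrative only; `empirical_realism`, the post-length histogram proxy, and the Jensen-Shannon-based score are assumptions for exposition, not the paper's actual metric.

```python
import math
from collections import Counter

def _length_distribution(posts, bins=20, bin_width=10):
    # Normalized histogram of post lengths (in characters) over fixed bins.
    counts = Counter(min(len(p) // bin_width, bins - 1) for p in posts)
    total = sum(counts.values())
    return [counts.get(i, 0) / total for i in range(bins)]

def _kl(p, q):
    # Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # Jensen-Shannon divergence, bounded in [0, 1] with log base 2.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)

def empirical_realism(real_posts, simulated_posts):
    """Score in [0, 1]; 1.0 means the simulated length distribution
    exactly matches the distribution observed in the real traces."""
    p = _length_distribution(real_posts)
    q = _length_distribution(simulated_posts)
    return 1.0 - js_divergence(p, q)
```

In practice one would benchmark richer properties than post length (reply timing, topic mix, pragmatic markers), but the pattern is the same: fit, simulate, then score the simulation against held-out behavior from the original data environment.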