PersonaArena: Dynamic Simulation for Evaluating and Enhancing Persona-Level Role-Playing in Large Language Models

Date: 2026-05-16
Citations: 0
Influential: 0

AI Summary
This work addresses two problems: current large language models struggle to maintain coherent, authentic persona-consistent role-playing in realistic social settings, and conventional evaluation methods rely on static setups that fail to capture the dynamic complexity of everyday interactions. To bridge this gap, the authors propose PersonaArena, a persona-level dynamic evaluation framework that constructs a fine-grained persona repository from filtered real-world social corpora, generates multi-turn, context-rich interactions within simulated environments, and uses a multi-agent debate mechanism for holistic automated assessment. The authors report that the framework improves model performance in both persona consistency and behavioral authenticity, offering a new paradigm for developing socially adaptive AI agents.
๐Ÿ“ Abstract
Large language models (LLMs) increasingly serve as interactive social agents, yet their ability to maintain coherent and authentic persona-level role-playing remains limited, particularly in realistic social scenarios. Existing research predominantly focuses on character-level settings and relies on static evaluation formats, failing to capture the complexity of everyday social interactions. In this work, we present PersonaArena, a dynamic simulation framework for evaluating and improving persona-level role-playing in LLMs. PersonaArena leverages a large, filtered corpus of user-generated social content to construct a nuanced persona bank, and elicits multi-turn, context-rich interactions within simulated social environments. Our framework features a multi-agent debating judge for holistic and unbiased assessment. Through extensive experiments, we demonstrate that PersonaArena enables rigorous evaluation and enhancement of LLMs' role-playing capabilities, advancing the development of more authentic and socially adept AI agents.
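
The abstract does not specify how the multi-agent debating judge works, so the following is only a minimal sketch of one way such a judge could be organized, not the authors' method. It assumes each judge produces a score in [0, 1], sees its peers' scores in each debate round, and may revise; the final verdict is the mean of the last-round scores. All names (`make_stub_judge`, `debate_score`, `openness`) are hypothetical, and the stub judges are deterministic stand-ins for LLM calls.

```python
from statistics import mean
from typing import Callable, List

# Hypothetical judge interface: (transcript, peer_scores) -> score in [0, 1].
Judge = Callable[[str, List[float]], float]


def make_stub_judge(initial: float, openness: float) -> Judge:
    """Build a deterministic stand-in for an LLM judge (assumption, for illustration).

    initial:  the score the judge gives when it sees no peer scores.
    openness: fraction of the gap between `initial` and the peer mean
              that the judge concedes when peer scores are shown (0 to 1).
    """
    def judge(transcript: str, peer_scores: List[float]) -> float:
        # A real judge would read `transcript`; the stub ignores it.
        if not peer_scores:
            return initial
        return initial + openness * (mean(peer_scores) - initial)

    return judge


def debate_score(transcript: str, judges: List[Judge], rounds: int = 2) -> float:
    """Run `rounds` of score revision among judges and return the mean final score."""
    # Round 0: every judge scores independently, without seeing peers.
    scores = [judge(transcript, []) for judge in judges]
    for _ in range(rounds):
        # Each judge sees every score except its own, then issues a revised score.
        scores = [
            judge(transcript, scores[:i] + scores[i + 1:])
            for i, judge in enumerate(judges)
        ]
    return mean(scores)


if __name__ == "__main__":
    panel = [
        make_stub_judge(0.2, 0.5),
        make_stub_judge(0.6, 0.5),
        make_stub_judge(0.9, 0.5),
    ]
    print(debate_score("example role-play transcript", panel))
```

In a real setup each stub would be replaced by a model call that reads the role-play transcript and the peers' arguments, not just their scores; the averaging step is likewise one assumed aggregation rule among many.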
Problem

Research questions and friction points this paper is trying to address.

persona-level role-playing
large language models
social interaction
dynamic simulation
authenticity
Innovation

Methods, ideas, or system contributions that make the work stand out.

dynamic simulation
persona-level role-playing
multi-agent judging
social interaction modeling
large language models