Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation

📅 2025-06-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) frequently exhibit out-of-character (OOC) behavior in open-ended generation—deviating from predefined persona traits—thereby undermining reliability. Existing evaluation methods, relying on coarse-grained global scoring, fail to detect fine-grained persona inconsistencies. Method: We propose the first atomic-level persona fidelity evaluation framework, introducing three computable metrics: (1) single-turn persona alignment, (2) cross-turn consistency, and (3) task-persona coupling effect. The framework integrates fine-grained human annotation, contrastive analysis, and human perception modeling to enable stable, multi-task, multi-persona quantification. Contribution/Results: Our approach significantly improves detection of latent OOC phenomena, achieving more precise and robust persona consistency assessment across diverse generation scenarios. Experimental results demonstrate superior sensitivity to subtle deviations compared to prior global metrics, enabling reliable persona-aware LLM evaluation.

📝 Abstract
Ensuring persona fidelity in large language models (LLMs) is essential for maintaining coherent and engaging human-AI interactions. However, LLMs often exhibit Out-of-Character (OOC) behavior, where generated responses deviate from an assigned persona, leading to inconsistencies that affect model reliability. Existing evaluation methods typically assign single scores to entire responses, struggling to capture subtle persona misalignment, particularly in long-form text generation. To address this limitation, we propose an atomic-level evaluation framework that quantifies persona fidelity at a finer granularity. Our three key metrics measure the degree of persona alignment and consistency within and across generations. Our approach enables a more precise and realistic assessment of persona fidelity by identifying subtle deviations that real users would encounter. Through our experiments, we demonstrate that our framework effectively detects persona inconsistencies that prior methods overlook. By analyzing persona fidelity across diverse tasks and personality types, we reveal how task structure and persona desirability influence model adaptability, highlighting challenges in maintaining consistent persona expression.
Problem

Research questions and friction points this paper is trying to address.

Detecting subtle persona deviations in LLM responses
Evaluating persona fidelity at atomic-level granularity
Assessing consistency across diverse tasks and personalities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Atomic-level evaluation framework for persona fidelity
Three key metrics for persona alignment assessment
Detects subtle deviations in long-form text generation
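The core idea of atomic-level evaluation can be illustrated with a minimal sketch: split a response into atomic units (here, sentences), score each unit's persona alignment with a judge function, then aggregate into an alignment score and a within-response consistency score. Note that the unit splitter, the `score_fn` judge, and the exact aggregation formulas below are illustrative assumptions, not the paper's actual metric definitions.

```python
from statistics import mean, pstdev


def atomic_persona_scores(response: str, score_fn) -> list[float]:
    """Split a response into atomic units (naively: sentences) and score
    each unit's persona alignment in [-1, 1] via a caller-supplied judge.
    A real implementation would use a proper sentence segmenter and an
    LLM- or classifier-based judge."""
    text = response.replace("!", ".").replace("?", ".")
    units = [s.strip() for s in text.split(".") if s.strip()]
    return [score_fn(u) for u in units]


def alignment(scores: list[float]) -> float:
    """Mean per-unit alignment within a single response."""
    return mean(scores)


def consistency(scores: list[float]) -> float:
    """Hypothetical consistency measure: less spread across atomic
    units means more consistent persona expression."""
    return 1.0 - pstdev(scores)
```

A global single-score evaluator could rate the mixed response below as neutral overall, whereas the atomic view exposes one aligned and one out-of-character unit:

```python
# Toy judge: a "cheerful" persona unit scores +1, anything else -1.
judge = lambda unit: 1.0 if "cheerful" in unit else -1.0
scores = atomic_persona_scores(
    "I feel cheerful today. Everything is terrible.", judge
)
# scores == [1.0, -1.0]; alignment(scores) == 0.0; consistency(scores) == 0.0
```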
Jisu Shin
School of Computing, Korea Advanced Institute of Science and Technology (KAIST)
Juhyun Oh
School of Computing, Korea Advanced Institute of Science and Technology (KAIST)
Eunsu Kim
KAIST
AI, NLP
Hoyun Song
Postdoctoral researcher, KAIST
NLP, Knowledge Integration, Domain-Specific Modeling, LLM
Alice Oh
KAIST Computer Science
machine learning, NLP, computational social science