🤖 AI Summary
Existing LLM alignment research employing synthetic personas suffers from inadequate representativeness and low ecological validity: target tasks and user populations are frequently left undefined, violating foundational principles of personalized modeling. Method: We systematically reviewed 63 relevant studies using a dual framework of content analysis and methodological assessment, applying multidimensional coding across sociodemographic dimensions and task-relevance criteria. Contribution/Results: Our analysis reveals that only 35% of studies discuss persona representativeness. To close this gap, we introduce the first transparency checklist specifically designed for LLM persona experiments, emphasizing empirically grounded sampling and context anchoring. This tool shifts persona construction from ad hoc, intuition-driven practice toward methodologically rigorous, reproducible protocols. It provides actionable guidance for enhancing the scientific validity and fairness of personalized LLM evaluation, supporting more robust and socially accountable alignment research.
📝 Abstract
Synthetic persona experiments have become a prominent method in Large Language Model alignment research, yet the representativeness and ecological validity of these personas vary considerably across studies. Through a review of 63 peer-reviewed studies published between 2023 and 2025 in leading NLP and AI venues, we reveal a critical gap: the task and population of interest are often underspecified in persona-based experiments, even though personalization fundamentally depends on these criteria. Our analysis shows substantial differences in user representation, with most studies focusing on a limited set of sociodemographic attributes and only 35% discussing the representativeness of their LLM personas. Based on our findings, we introduce a persona transparency checklist that emphasizes representative sampling, explicit grounding in empirical data, and enhanced ecological validity. Our work provides both a comprehensive assessment of current practices and practical guidelines for improving the rigor and ecological validity of persona-based evaluations in language model alignment research.