🤖 AI Summary
This study investigates how information about the interlocutor's persona influences response consistency and speaker identifiability in dialogue generation. It proposes an evaluation framework that frames the problem as author identification, combining LLM-as-a-judge assessment, human evaluation, controlled persona masking/disclosure experiments, and analysis of dialogue pairs spanning diverse speakers and topics. Results show that access to the interlocutor's persona significantly improves target-speaker identification, while masking it has the opposite effect; models generalise well across topics but degrade substantially when facing unfamiliar interlocutors. In zero-shot settings, LLMs often copy biographical details verbatim, which makes identification easier but trivialises the task. The findings highlight the underexplored role of the interlocutor's persona in grounding speaker identity and coherence in generative dialogue systems.
📝 Abstract
Endowing dialogue agents with persona information has proven to significantly improve the consistency and diversity of their generations. While much focus has been placed on aligning dialogues with provided personas, adaptation to the interlocutor's profile remains largely underexplored. In this work, we investigate three key aspects: (1) a model's ability to align responses with both its own persona and the interlocutor's; (2) its robustness when dealing with familiar versus unfamiliar interlocutors and topics; and (3) the impact of additional fine-tuning on specific persona-based dialogues. We evaluate dialogues generated with diverse speaker pairings and topics, framing the evaluation as an author identification task and employing both LLM-as-a-judge and human evaluations. By systematically masking or disclosing information about the interlocutor, we assess its impact on dialogue generation. Results show that access to the interlocutor's persona improves recognition of the target speaker, while masking it does the opposite. Although models generalise well across topics, they struggle with unfamiliar interlocutors. Finally, we find that in zero-shot settings, LLMs often copy biographical details verbatim, which facilitates identification but trivialises the task.
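The masking/disclosure protocol and the author-identification framing described above can be sketched minimally as follows. Everything here is illustrative: the function names, the prompt layout, and the keyword-overlap "judge" are hypothetical stand-ins (the paper uses LLM-as-a-judge and human evaluators, not a lexical scorer), intended only to show the shape of the evaluation.

```python
def build_prompt(speaker_persona, interlocutor_persona=None, topic="travel"):
    """Compose a generation prompt; passing interlocutor_persona=None
    corresponds to the masking condition, a string to disclosure."""
    lines = [f"Your persona: {speaker_persona}"]
    if interlocutor_persona is not None:  # disclosure condition
        lines.append(f"Interlocutor persona: {interlocutor_persona}")
    lines.append(f"Discuss the topic: {topic}")
    return "\n".join(lines)


def identify_author(response, candidate_personas):
    """Toy author-identification judge: pick the candidate persona whose
    wording overlaps most with the response. A real evaluation would ask
    an LLM judge (or a human) to make this choice instead."""
    resp_words = set(response.lower().split())
    scores = {
        name: len(resp_words & set(persona.lower().split()))
        for name, persona in candidate_personas.items()
    }
    return max(scores, key=scores.get)


candidates = {
    "A": "I love hiking and mountains",
    "B": "I play jazz piano in a band",
}
masked = build_prompt("I love hiking and mountains")            # interlocutor hidden
disclosed = build_prompt("I love hiking and mountains",
                         "I play jazz piano in a band")          # interlocutor shown
guess = identify_author("I went hiking in the mountains yesterday", candidates)
```

Under this toy judge, verbatim persona copying trivially maximises the overlap score, which mirrors the paper's observation that zero-shot biographical copying makes identification easy for the wrong reason.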