AI Summary
This work addresses persona drift, role confusion, and echoing in large language models (LLMs) during multi-turn dialogue, all of which undermine conversational consistency. The authors propose SPASM, a stability-first modular framework with three stages: persona creation, Client–Responder dialogue generation, and termination detection. Central to the approach is Egocentric Context Projection (ECP): dialogue history is stored in a perspective-agnostic representation and projected into each agent's own view before generation, which improves role stability without modifying model weights. Experiments covering 4,500 personas and 45,000 conversations show that ECP substantially reduces persona drift. Under human validation, echoing in LLM-to-LLM conversations is eliminated, and embedding analyses recover persona structure and reveal responder-driven interaction geometry.
Abstract
Large language models are increasingly deployed in multi-turn settings such as tutoring, support, and counseling, where reliability depends on preserving consistent roles, personas, and goals across long horizons. This requirement becomes critical when LLMs are used to generate synthetic dialogues for training and evaluation, since LLM–LLM conversations can accumulate identity-related failures such as persona drift, role confusion, and "echoing", where one agent gradually mirrors its partner. We introduce SPASM (Stable Persona-driven Agent Simulation for Multi-turn dialogue generation), a modular, stability-first framework that decomposes simulation into (i) persona creation via schema sampling, plausibility validation, and natural-language persona crafting, (ii) Client–Responder dialogue generation, and (iii) termination detection for coherent stopping. To improve long-horizon stability without changing model weights, we propose Egocentric Context Projection (ECP): dialogue history is stored in a perspective-agnostic representation and deterministically projected into each agent's egocentric view before generation. Across three LLM backbones (GPT-4o-mini, DeepSeek-V3.2, Qwen-Plus) and nine Client–Responder pairings, we construct a dataset of 4,500 personas and 45,000 conversations (500 personas × 10 conversations per pairing). Ablations show ECP substantially reduces persona drift and, under human validation, eliminates echoing; embedding analyses recover persona structure and reveal strong responder-driven interaction geometry. Our code is available at https://github.com/lhannnn/SPASM.
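To make the ECP idea concrete, the following is a minimal sketch of one way such a projection could look; it is not taken from the SPASM repository. It assumes that each stored turn carries a canonical speaker label (`"client"` or `"responder"`) and that the egocentric view uses the common chat convention in which an agent's own turns appear as `assistant` and its partner's turns as `user`. The names `Turn` and `project` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One stored turn, labeled by canonical speaker rather than by chat role."""
    speaker: str  # assumed labels: "client" or "responder"
    text: str


def project(history, viewer):
    """Map the shared history into the chat-role view of `viewer`.

    Assumption: the viewer's own turns become "assistant" messages and
    all other turns become "user" messages. The mapping depends only on
    the stored speaker label, so it is deterministic.
    """
    messages = []
    for turn in history:
        role = "assistant" if turn.speaker == viewer else "user"
        messages.append({"role": role, "content": turn.text})
    return messages


# A single shared history, stored once in perspective-agnostic form.
history = [
    Turn("client", "I have been feeling anxious about my exams."),
    Turn("responder", "That sounds stressful. What worries you most?"),
]

client_view = project(history, "client")
responder_view = project(history, "responder")
```

In this sketch both agents read from the same stored history, and only the role labels differ between the two views, so neither agent ever sees its own past turns attributed to its partner.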