AI Summary
This work addresses the unnatural interaction in existing stepwise AI chat systems, which stems from their failure to model human conversational timing. To overcome this limitation, we propose Stephanie2, a novel stepwise decision-making dialogue agent that jointly models thinking and typing durations for the first time and incorporates an active waiting mechanism to explicitly decide, at each step, whether to send or delay a message. Leveraging a time-window-based dual-agent dialogue generation framework, Stephanie2 adaptively modulates message pacing and generates synthetic dialogue histories for evaluation. Experimental results demonstrate that Stephanie2 significantly outperforms prior models in perceived naturalness and engagement, and achieves a higher human pass rate in role-identification Turing tests.
Abstract
Human social chat over instant messaging typically progresses through a sequence of short messages. Existing step-by-step AI chat systems usually split a one-shot generation into multiple messages and send them sequentially, but they lack an active waiting mechanism and exhibit unnatural message pacing. To address these issues, we propose Stephanie2, a novel step-wise decision-making dialogue agent. With active waiting and message-pace adaptation, Stephanie2 explicitly decides at each step whether to send or wait, and models latency as the sum of thinking time and typing time to achieve more natural pacing. We further introduce a time-window-based dual-agent dialogue system to generate pseudo dialogue histories for human and automatic evaluations. Experiments show that Stephanie2 clearly outperforms Stephanie1 on metrics such as naturalness and engagement, and achieves a higher pass rate in human evaluation with the role-identification Turing test.
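To make the pacing idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that thinking time grows with the length of the incoming message and that typing time grows with the length of the outgoing one. The send-or-wait rule is a hand-written stand-in for a decision that the abstract attributes to the agent itself. All names (`PacingConfig`, `message_latency`, `decide_action`) and all numeric defaults are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PacingConfig:
    # Hypothetical constants chosen for illustration; none come from the paper.
    base_think_seconds: float = 0.5
    think_seconds_per_word: float = 0.05
    typing_chars_per_second: float = 5.0


def thinking_time(incoming: str, cfg: PacingConfig) -> float:
    """Assumed model: a fixed base plus a per-word cost for reading the incoming message."""
    return cfg.base_think_seconds + cfg.think_seconds_per_word * len(incoming.split())


def typing_time(outgoing: str, cfg: PacingConfig) -> float:
    """Assumed model: characters to type divided by a constant typing speed."""
    return len(outgoing) / cfg.typing_chars_per_second


def message_latency(incoming: str, outgoing: str, cfg: PacingConfig) -> float:
    """Latency as the sum of thinking time and typing time, as the abstract describes."""
    return thinking_time(incoming, cfg) + typing_time(outgoing, cfg)


def decide_action(pending_reply: Optional[str], partner_is_typing: bool) -> str:
    """Rule-based stand-in for the per-step send-or-wait decision.

    Waits when there is nothing to send or when the partner is still typing;
    otherwise sends. The real agent's decision criteria are not given here.
    """
    if pending_reply is None or partner_is_typing:
        return "wait"
    return "send"


if __name__ == "__main__":
    cfg = PacingConfig()
    delay = message_latency("how are you today", "good, you?", cfg)
    print(f"action={decide_action('good, you?', False)} delay={delay:.2f}s")
```

Under these assumptions, a longer reply is delayed in proportion to its length rather than by a fixed interval, which is the property the abstract associates with more natural pacing.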