🤖 AI Summary
Current conversational agents struggle to detect and respond to user interruptions in real time, and in particular fail to distinguish cooperative interruptions (e.g., agreement, assistance, clarification) from disruptive ones. This paper introduces the first intent-aware real-time interruption handling framework, inspired by human–human dialogue patterns. The approach integrates multimodal interruption detection (leveraging speech, prosodic, and turn-taking cues) with a lightweight intent classifier, an LLM-augmented collaborative decision module, and a dynamic dialogue state machine that enables immediate state reset and response strategy adaptation. In a user study with 21 participants, the system successfully handled 104 out of 111 interruptions (93.69%), significantly improving dialogue naturalness and task completion efficiency compared to baseline systems.
📝 Abstract
Interruptions, a fundamental component of human communication, can enhance the dynamism and effectiveness of conversations, but only when effectively managed by all parties involved. Despite advancements in robotic systems, state-of-the-art systems still have limited capabilities in handling user-initiated interruptions in real time. Prior research has primarily focused on post hoc analysis of interruptions. To address this gap, we present a system that detects user-initiated interruptions and manages them in real time based on the interrupter's intent (i.e., cooperative agreement, cooperative assistance, cooperative clarification, or disruptive interruption). The system was designed based on interaction patterns identified from human-human interaction data. We integrated our system into an LLM-powered social robot and validated its effectiveness through a timed decision-making task and a contentious discussion task with 21 participants. Our system successfully handled 93.69% (n=104/111) of user-initiated interruptions. We discuss our learnings and their implications for designing interruption-handling behaviors in conversational robots.
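To make the intent-based handling concrete, the following is a minimal sketch of how a classified interruption might be routed to a robot response. Only the four intent labels come from the abstract; the `Response` values, the `POLICY` mapping, and the function names are hypothetical and are not the authors' actual handling strategies. The helper `handled_rate` simply reproduces the reported success-rate arithmetic.

```python
from enum import Enum


class Intent(Enum):
    """The four interrupter intents named in the abstract."""
    COOPERATIVE_AGREEMENT = "cooperative agreement"
    COOPERATIVE_ASSISTANCE = "cooperative assistance"
    COOPERATIVE_CLARIFICATION = "cooperative clarification"
    DISRUPTIVE = "disruptive interruption"


class Response(Enum):
    """Hypothetical robot responses (assumed, not from the paper)."""
    ACKNOWLEDGE_AND_CONTINUE = "acknowledge and continue"
    INCORPORATE_AND_CONTINUE = "incorporate input and continue"
    PAUSE_AND_CLARIFY = "pause and clarify"
    YIELD_FLOOR = "yield the floor"


# Assumed policy for illustration only: one response per intent.
POLICY = {
    Intent.COOPERATIVE_AGREEMENT: Response.ACKNOWLEDGE_AND_CONTINUE,
    Intent.COOPERATIVE_ASSISTANCE: Response.INCORPORATE_AND_CONTINUE,
    Intent.COOPERATIVE_CLARIFICATION: Response.PAUSE_AND_CLARIFY,
    Intent.DISRUPTIVE: Response.YIELD_FLOOR,
}


def handle_interruption(intent: Intent) -> Response:
    """Look up the (assumed) response for a classified interruption."""
    return POLICY[intent]


def handled_rate(handled: int, total: int) -> float:
    """Percentage of interruptions handled, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * handled / total, 2)


if __name__ == "__main__":
    print(handle_interruption(Intent.COOPERATIVE_CLARIFICATION).value)
    print(handled_rate(104, 111))  # 93.69, matching the reported figure
```

A lookup table keeps the sketch easy to inspect; a real system would presumably also condition the response on dialogue state and timing, which this example leaves out.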