Conversational Behavior Modeling Foundation Model With Multi-Level Perception

📅 2026-02-11
🤖 AI Summary
This work addresses the challenge of modeling and naturally expressing the implicit chain of thoughts behind full-duplex dialogue by proposing a Graph-of-Thoughts (GoT) framework. The approach uses multi-level perception to capture the causal and temporal dependencies between high-level communicative intents and low-level speech acts. It combines a hierarchical labeling scheme, graph-structured reasoning, and a transformer that performs streaming inference, trained on a high-quality corpus that pairs controllable, event-rich dialogue data with human-annotated labels. On both synthetic and real duplex dialogues, the framework delivers robust behavior detection and interpretable reasoning chains, and it lays a foundation for benchmarking conversational reasoning in full-duplex spoken dialogue systems.

📝 Abstract
Human conversation is organized by an implicit chain of thoughts that manifests as timed speech acts. Capturing this perceptual pathway is key to building natural full-duplex interactive systems. We introduce a framework that models this process as multi-level perception, and then reasons over conversational behaviors via a Graph-of-Thoughts (GoT). Our approach formalizes the intent-to-action pathway with a hierarchical labeling scheme, predicting high-level communicative intents and low-level speech acts to learn their causal and temporal dependencies. To train this system, we develop a high-quality corpus that pairs controllable, event-rich dialogue data with human-annotated labels. The GoT framework structures streaming predictions as an evolving graph, enabling a transformer to forecast the next speech act, generate concise justifications for its decisions, and dynamically refine its reasoning. Experiments on both synthetic and real duplex dialogues show that the framework delivers robust behavior detection, produces interpretable reasoning chains, and establishes a foundation for benchmarking conversational reasoning in full-duplex spoken dialogue systems.
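
As a rough illustration of what "structuring streaming predictions as an evolving graph" could look like, the sketch below keeps one node per predicted intent and speech act, links each intent to the speech act it gives rise to, and links consecutive speech acts in time. This is not the authors' implementation: the class names, the edge relations `causes` and `precedes`, and the example labels are all hypothetical, chosen only to make the two-level (intent, speech act) structure concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    level: str     # "intent" (high level) or "speech_act" (low level)
    label: str     # hypothetical label name, not taken from the paper
    time_s: float  # time of the prediction in seconds


@dataclass
class ThoughtGraph:
    """Illustrative evolving graph over streaming two-level predictions."""
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (src_id, dst_id, relation)

    def add_node(self, level: str, label: str, time_s: float) -> Node:
        node = Node(len(self.nodes), level, label, time_s)
        self.nodes.append(node)
        return node

    def add_prediction(self, intent: str, speech_act: str, time_s: float) -> Node:
        """Append one (intent, speech act) prediction and its edges."""
        intent_node = self.add_node("intent", intent, time_s)
        act_node = self.add_node("speech_act", speech_act, time_s)
        # Assumed causal link: the intent gives rise to the speech act.
        self.edges.append((intent_node.node_id, act_node.node_id, "causes"))
        # Assumed temporal link: previous speech act precedes the new one.
        earlier_acts = [n for n in self.nodes[:-2] if n.level == "speech_act"]
        if earlier_acts:
            self.edges.append(
                (earlier_acts[-1].node_id, act_node.node_id, "precedes")
            )
        return act_node
```

In a streaming setting, a model would call `add_prediction` once per step and could read the accumulated nodes and edges back as context for the next forecast; how the paper's transformer actually consumes the graph is not specified in this abstract.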
Problem

Research questions and friction points this paper is trying to address.

conversational behavior
multi-level perception
full-duplex dialogue
speech acts
chain of thoughts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-of-Thoughts
multi-level perception
conversational behavior modeling
full-duplex dialogue
hierarchical intent-action modeling
Dingkun Zhou
University of California, Berkeley, Berkeley, CA, USA; South China University of Technology, Guangzhou, Guangdong, China
Shuchang Pan
Zhejiang University, Hangzhou, Zhejiang, China
Jiachen Lian
UC Berkeley
precision healthcare, speech processing, machine learning
Siddharth Banerjee
Assistant Professor
Workforce Development, Project Risk Management, Text Analytics, Data Visualization
Sarika Pasumarthy
University of California, Berkeley, Berkeley, CA, USA
Dhruv Hebbar
University of California, Berkeley, Berkeley, CA, USA
Siddhant Patel
University of California, Berkeley, Berkeley, CA, USA
Zeyi Austin Li
University of California, Berkeley, Berkeley, CA, USA
Kan Jen Cheng
University of California, Berkeley, Berkeley, CA, USA
Sanay Bordia
University of California, Berkeley, Berkeley, CA, USA
Krish Patel
University of California, Berkeley, Berkeley, CA, USA
Akshaj Gupta
University of California, Berkeley, Berkeley, CA, USA
Tingle Li
PhD Student, UC Berkeley
Multimodal Learning, Auditory Perception, Speech Processing, Computer Vision
Gopala Anumanchipalli
University of California, Berkeley, Berkeley, CA, USA