GEM: Graph-Enhanced Mixture-of-Experts with ReAct Agents for Dialogue State Tracking

📅 2026-05-05
📈 Citations: 0
Influential: 0

📝 Abstract
Dialogue State Tracking (DST) requires precise extraction of structured information from multi-domain conversations, a task on which Large Language Models (LLMs) struggle despite their strong general capabilities. We present GEM (Graph-Enhanced Mixture-of-Experts), a framework that combines language models and graph-structured dialogue understanding with ReAct agent-based reasoning. A router dynamically selects between two specialized experts: a Graph Neural Network that captures dialogue structure and turn-level dependencies, and a fine-tuned T5-Small encoder-decoder for sequence modeling. For complex value generation tasks, we integrate ReAct agents that perform structured reasoning over the dialogue context. On MultiWOZ 2.2, GEM achieves 65.19% Joint Goal Accuracy, outperforming end-to-end LLM approaches (best: 38.43%) by a wide margin and surpassing state-of-the-art (SOTA) methods including TOATOD (63.79%), D3ST (58.70%), and Diable (56.48%). These results indicate that combining structured dialogue representation with dynamic expert routing and agent-based reasoning is effective for dialogue state tracking, and that selective expert activation keeps the approach computationally efficient.
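The headline metric, Joint Goal Accuracy, counts a turn as correct only when the predicted dialogue state matches the gold state on every slot. The sketch below illustrates that definition only; the function name, the dict-based state representation, and the slot names in the example are assumptions made for illustration, not the paper's evaluation code.

```python
from typing import Dict, List

# A dialogue state is modelled here as a flat slot -> value mapping (assumed format).
State = Dict[str, str]


def joint_goal_accuracy(predicted: List[State], gold: List[State]) -> float:
    """Fraction of turns whose predicted state equals the gold state exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold:
        return 0.0
    # A single wrong, missing, or extra slot makes the whole turn incorrect.
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Illustrative data: the second turn gets one slot value wrong.
gold_states = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
]
predicted_states = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},
]
score = joint_goal_accuracy(predicted_states, gold_states)
```

Because the comparison is all-or-nothing per turn, the score is stricter than per-slot accuracy: in the example, three of four slot predictions are right, yet only one of two turns counts.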
Problem

Research questions and friction points this paper is trying to address.

Dialogue State Tracking
Large Language Models
Structured Information Extraction
Multi-domain Conversations
Graph-structured Dialogue Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Neural Network
Mixture-of-Experts
ReAct Agents
Dialogue State Tracking
Dynamic Routing
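
To make the dynamic routing idea concrete, here is a minimal sketch of top-1 routing between two experts. It is not GEM's router: the linear gate, the hand-set weights, the feature vector, and the stand-in expert functions are all hypothetical, and the abstract does not say whether the actual router uses top-1 selection.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Expert = Callable[[Sequence[float]], str]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top1(
    features: Sequence[float],
    gate_weights: Dict[str, Sequence[float]],
    experts: Dict[str, Expert],
) -> Tuple[str, str, List[float]]:
    """Score each expert with a linear gate and run only the best-scoring one."""
    names = list(experts)
    scores = [
        sum(w * x for w, x in zip(gate_weights[name], features))
        for name in names
    ]
    probs = softmax(scores)
    best = max(range(len(names)), key=lambda i: probs[i])
    chosen = names[best]
    # Only the selected expert is executed; the other is skipped entirely.
    return chosen, experts[chosen](features), probs


# Hypothetical stand-ins for the graph expert and the T5-Small expert.
demo_experts: Dict[str, Expert] = {
    "graph": lambda f: "graph-expert output",
    "t5": lambda f: "t5-expert output",
}
# Hypothetical gate weights; a real router would learn these.
demo_gate = {"graph": [2.0, 0.0], "t5": [0.0, 2.0]}

chosen_name, output, gate_probs = route_top1([1.0, 0.0], demo_gate, demo_experts)
```

Running only the selected expert is what makes activation selective: the cost of a turn is the gate plus one expert, rather than both experts.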