AgenticAI-DialogGen: Topic-Guided Conversation Generation for Fine-Tuning and Evaluating Short- and Long-Term Memories of LLMs

📅 2026-04-13
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing dialogue datasets rarely model short-term and long-term memory jointly, and many lack thematic coherence or speaker consistency, which limits the fine-tuning and evaluation of large language models' memory capabilities. To address this, the work proposes AgenticAI-DialogGen, a modular multi-agent framework that generates topic-guided dialogues without manual annotation, together with TopicGuidedChat (TGC), a dataset built with it. The framework automatically extracts knowledge graphs, identifies dialogue topics, constructs speaker personas, and generates dialogues along with corresponding memory-oriented question-answer pairs. The dataset combines explicit long-term memory (speaker-specific knowledge graphs) with short-term conversational context, and models fine-tuned on it are reported to perform better on memory-related question-answering tasks.

📝 Abstract
Recent advancements in Large Language Models (LLMs) have improved their ability to process extended conversational contexts, yet fine-tuning and evaluating short- and long-term memories remain difficult due to the absence of datasets that encode both short- and long-term conversational history. Existing conversational datasets lack memory grounding, overlook topic continuity, or rely on costly human annotation. To address these gaps, we introduce AgenticAI-DialogGen, a modular agent-based framework that generates persona-grounded and topic-guided conversations without human supervision. The framework uses LLM agents to extract knowledge graphs, identify topics, build speaker personas, and simulate topic-guided conversations from unstructured conversations. A QA module generates memory-grounded Question-Answer (QA) pairs drawn from short- and long-term conversational histories. We also generate a new dataset, TopicGuidedChat (TGC), in which long-term memory is encoded as speaker-specific knowledge graphs and short-term memory as newly generated topic-guided conversations. Evaluations show that AgenticAI-DialogGen yields higher conversational quality and that LLMs fine-tuned on the TGC dataset achieve improved performance on memory-grounded QA tasks.
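The stage order named in the abstract (knowledge-graph extraction, topic identification, persona building, conversation simulation, QA generation) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the names `GeneratedSample`, `run_pipeline`, and the `toy_*` functions are hypothetical, the data flow between stages is guessed rather than taken from the paper, and simple string rules stand in for the LLM agents the framework actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object); assumed triple format


@dataclass
class GeneratedSample:
    """Hypothetical record for one generated sample."""
    knowledge_graph: List[Triple]    # long-term memory (speaker-specific graph)
    topics: List[str]
    persona: str
    conversation: List[str]          # short-term memory (new topic-guided dialogue)
    qa_pairs: List[Tuple[str, str]]  # (question, answer)


def run_pipeline(
    source: List[str],
    extract_kg: Callable[[List[str]], List[Triple]],
    identify_topics: Callable[[List[Triple]], List[str]],
    build_persona: Callable[[List[Triple]], str],
    simulate: Callable[[str, List[str]], List[str]],
    make_qa: Callable[[List[Triple], List[str]], List[Tuple[str, str]]],
) -> GeneratedSample:
    """Chain the stages in the order the abstract lists them (data flow assumed)."""
    kg = extract_kg(source)
    topics = identify_topics(kg)
    persona = build_persona(kg)
    conversation = simulate(persona, topics)
    qa_pairs = make_qa(kg, conversation)
    return GeneratedSample(kg, topics, persona, conversation, qa_pairs)


# Toy stand-ins for the LLM agents; each is a deliberately trivial rule.
def toy_extract_kg(dialogue: List[str]) -> List[Triple]:
    triples = []
    for utterance in dialogue:
        words = utterance.rstrip(".").split()
        # Only recognizes the exact pattern "<name> likes <thing>".
        if len(words) == 3 and words[1] == "likes":
            triples.append((words[0], "likes", words[2]))
    return triples


def toy_identify_topics(kg: List[Triple]) -> List[str]:
    return sorted({obj for _, _, obj in kg})


def toy_build_persona(kg: List[Triple]) -> str:
    return "; ".join(f"{s} {r} {o}" for s, r, o in kg)


def toy_simulate(persona: str, topics: List[str]) -> List[str]:
    return [f"Let's talk about {topic}." for topic in topics]


def toy_make_qa(kg: List[Triple], conversation: List[str]) -> List[Tuple[str, str]]:
    # This toy version only queries the graph and ignores the conversation.
    return [(f"What does {s} like?", o) for s, r, o in kg if r == "likes"]


sample = run_pipeline(
    ["Ana likes hiking.", "Hello there."],
    toy_extract_kg,
    toy_identify_topics,
    toy_build_persona,
    toy_simulate,
    toy_make_qa,
)
print(sample)
```

Passing each stage as a callable mirrors the modular design the abstract describes: any toy function could be swapped for an LLM-backed agent without changing `run_pipeline`.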
Problem

Research questions and friction points this paper is trying to address.

conversational memory
large language models
topic continuity
memory grounding
dialogue datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

AgenticAI-DialogGen
topic-guided conversation
memory-grounded QA
knowledge graph
persona-grounded dialogue