AgenticAI-DialogGen: Topic-Guided Conversation Generation for Fine-Tuning and Evaluating Short- and Long-Term Memories of LLMs

📅 2026-04-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing dialogue datasets struggle to jointly model short-term and long-term memory and often lack thematic coherence and speaker consistency, which limits the fine-tuning and evaluation of large language models’ memory capabilities. To address this, this work proposes AgenticAI-DialogGen, a modular, multi-agent framework that generates topic-guided dialogues without manual annotation. The framework automatically extracts knowledge graphs, identifies dialogue topics, constructs speaker personas, and generates dialogues together with memory-oriented question-answer pairs. The resulting dataset, TopicGuidedChat (TGC), pairs explicit long-term memory (speaker-specific knowledge graphs) with short-term conversational context, and models fine-tuned on it perform better on memory-related question-answering tasks.

📝 Abstract

Recent advancements in Large Language Models (LLMs) have improved their ability to process extended conversational contexts, yet fine-tuning and evaluating short- and long-term memory remains difficult due to the absence of datasets that encode both short- and long-term conversational history. Existing conversational datasets lack memory grounding, overlook topic continuity, or rely on costly human annotation. To address these gaps, we introduce AgenticAI-DialogGen, a modular agent-based framework that generates persona-grounded and topic-guided conversations without human supervision. The framework uses LLM agents to extract knowledge graphs, identify topics, build speaker personas, and simulate topic-guided conversations from unstructured conversations. A QA module generates memory-grounded question-answer (QA) pairs drawn from short- and long-term conversational histories. We also generate a new dataset, TopicGuidedChat (TGC), in which long-term memory is encoded as speaker-specific knowledge graphs and short-term memory as newly generated topic-guided conversations. Evaluations show that AgenticAI-DialogGen yields higher conversational quality and that LLMs fine-tuned on the TGC dataset achieve improved performance on memory-grounded QA tasks.
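
The abstract describes long-term memory as speaker-specific knowledge graphs and short-term memory as the current conversation, with QA pairs grounded in either store. The snippet below is a minimal sketch of how such a record might be represented. It is not the authors' implementation: the names `SpeakerMemory`, `MemoryQA`, `qa_from_long_term`, and `locate_answer` are hypothetical, and the fixed question template is a stand-in for the LLM-based QA module.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Assumption: a knowledge-graph edge is a (subject, relation, object) triple.
Triple = Tuple[str, str, str]


@dataclass
class SpeakerMemory:
    """Hypothetical container for one speaker's two memory stores."""
    speaker: str
    long_term: Set[Triple] = field(default_factory=set)   # knowledge-graph triples
    short_term: List[str] = field(default_factory=list)   # utterances in the current conversation


@dataclass
class MemoryQA:
    """A QA pair tagged with the memory store that grounds its answer."""
    question: str
    answer: str
    source: str  # "long_term" or "short_term"


def qa_from_long_term(memory: SpeakerMemory) -> List[MemoryQA]:
    """Turn each triple into a templated QA pair (a stand-in for an LLM agent)."""
    return [
        MemoryQA(
            question=f"What is the '{relation}' of {subject}?",
            answer=obj,
            source="long_term",
        )
        for subject, relation, obj in sorted(memory.long_term)
    ]


def locate_answer(memory: SpeakerMemory, answer: str) -> Optional[str]:
    """Return which store supports the answer string, checking short-term first."""
    needle = answer.lower()
    if any(needle in utterance.lower() for utterance in memory.short_term):
        return "short_term"
    if any(needle == obj.lower() for _, _, obj in memory.long_term):
        return "long_term"
    return None
```

A record built this way lets an evaluation script check whether a model's answer is supported by the recent conversation or only by the persisted graph, which is the distinction the memory-grounded QA task depends on.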

Problem

Research questions and friction points this paper is trying to address.

conversational memory

large language models

topic continuity

memory grounding

dialogue datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

AgenticAI-DialogGen

topic-guided conversation

memory-grounded QA