🤖 AI Summary
Existing conversational agents struggle to integrate heterogeneous auxiliary data (such as knowledge bases and user personas) because of information redundancy, weak cross-source associations, and poor factual consistency, which limits their adaptability to diverse user preferences and belief systems. To address this, we propose a sparse, symmetric latent interaction mechanism coupled with a bidirectional word-level similarity measure, enabling selective, word-level information sharing across sources followed by post-fusion grounding. We further design dedicated encoding streams and a grounding network to enforce cross-source semantic alignment and factual constraints. Evaluated on persona and knowledge prediction tasks, our approach significantly outperforms two state-of-the-art baselines, improving both the factual accuracy and the naturalness of responses. The method offers a novel, interpretable, and scalable paradigm for multi-source-augmented dialogue generation.
📝 Abstract
Recent advances in AI-driven conversational agents have demonstrated the immense potential of AI applications. Effective response generation is crucial to the success of these agents. Although extensive research has focused on leveraging multiple auxiliary data sources (e.g., knowledge bases and personas) to enhance response generation, existing methods often struggle to extract relevant information from these sources efficiently. Clear limitations remain in combining versatile conversational capabilities with adherence to known facts and adaptation to large variations in user preferences and belief systems, and these limitations continue to hinder the wide adoption of conversational AI tools. This paper introduces a novel method for conversation generation, Conversational Agent for Multi-Source Auxiliary Context with Sparse and Symmetric Latent Interactions (CoMAC), which employs specialized encoding streams and post-fusion grounding networks for multiple data sources to identify the persona and knowledge information relevant to the conversation. CoMAC also leverages a novel text similarity metric that allows bi-directional information sharing among multiple sources and focuses on a selective subset of meaningful words. Our experiments show that CoMAC significantly improves the accuracy of relevant persona and knowledge prediction, as well as response generation quality, over two state-of-the-art methods.
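To make the idea of a sparse, symmetric, word-level similarity more concrete, the following is a minimal illustrative sketch. It is not the CoMAC metric itself, whose exact definition is not given in this abstract. It assumes each source (for example a persona sentence or a knowledge snippet) is already represented as a list of word embedding vectors, matches every word to its most similar word in the other source using cosine similarity, keeps only the `top_k` strongest matches (the "selective subset"), and averages the two directions so the score is symmetric. The function names `cosine`, `directed_score`, and `sparse_symmetric_similarity`, and the `top_k` parameter, are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two word vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def directed_score(src: List[Vector], tgt: List[Vector], top_k: int) -> float:
    """One direction: match each source word to its best target word,
    then average only the top_k strongest matches (assumed sparsity rule)."""
    if not src or not tgt:
        return 0.0
    best = [max(cosine(s, t) for t in tgt) for s in src]
    kept = sorted(best, reverse=True)[:max(1, top_k)]
    return sum(kept) / len(kept)


def sparse_symmetric_similarity(
    a: List[Vector], b: List[Vector], top_k: int = 2
) -> float:
    """Average of both directions, so swapping a and b gives the same score."""
    return 0.5 * (directed_score(a, b, top_k) + directed_score(b, a, top_k))


# Toy 2-dimensional "embeddings", invented purely for illustration.
persona = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
knowledge = [[1.0, 0.0], [0.0, 2.0]]
score = sparse_symmetric_similarity(persona, knowledge, top_k=2)
```

In this toy example the third persona word has only a partial match in the knowledge snippet, but because just the two strongest matches per direction are kept, it does not lower the score. A learned model would presumably select words with trained weights rather than a fixed `top_k`; the hard cutoff here is only a stand-in for that selection.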