🤖 AI Summary
Existing knowledge-augmented text generation methods rely on domain-specific retrievers, which limits their generalizability and interpretability. To address this, we propose a task-agnostic structured knowledge acquisition framework built on the dual-layer structure of knowledge: high-level entities and low-level knowledge triples. The framework models fine-grained semantic alignment through a local-global interaction scheme and performs faithful, cross-task and cross-datatype knowledge retrieval and fusion with a hierarchical Transformer-based pointer network. Crucially, it couples knowledge representation learning with the generation process, preserving language-model fluency while substantially improving the transparency and credibility of the output. Experiments on RotoWireFG (table-to-text generation) and KdConv (dialogue response generation) show state-of-the-art performance in both automatic and human evaluations, improving generation quality and interpretability simultaneously.
📝 Abstract
Knowledge-enhanced text generation aims to improve the quality of generated text by utilizing internal or external knowledge sources. While language models have demonstrated impressive capabilities in generating coherent and fluent text, their lack of interpretability presents a substantial obstacle. The limited interpretability of generated text significantly impacts its practical usability, particularly in knowledge-enhanced text generation tasks that demand reliability and explainability. Existing methods often employ domain-specific knowledge retrievers tailored to particular data characteristics, limiting their generalizability to diverse data types and tasks. To overcome this limitation, we directly leverage the two-tier architecture of structured knowledge, consisting of high-level entities and low-level knowledge triples, to design a task-agnostic structured knowledge hunter. Specifically, we employ a local-global interaction scheme for structured knowledge representation learning and a hierarchical Transformer-based pointer network as the backbone for selecting relevant knowledge triples and entities. By combining the strong generative ability of language models with the high faithfulness of the knowledge hunter, our model achieves high interpretability, enabling users to comprehend how the output is generated. Furthermore, we empirically demonstrate the effectiveness of our model on both internal knowledge-enhanced table-to-text generation (RotoWireFG) and external knowledge-enhanced dialogue response generation (KdConv). Our task-agnostic model outperforms state-of-the-art methods and the corresponding language models, setting a new state of the art on both benchmarks.
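To make the two-tier selection concrete, here is a minimal sketch of hierarchical pointer-style selection over structured knowledge: score high-level entities against a query, pick one, then score that entity's low-level triples. This is an illustration under simplifying assumptions, not the paper's implementation; the actual model uses learned hierarchical Transformer encoders and attention, whereas this sketch uses fixed embeddings and dot-product scoring, and all names (`pointer_select`, `hierarchical_select`) are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pointer_select(query, candidates):
    """Pointer-style selection: attend from the query to each candidate
    embedding and return the attention distribution plus the argmax index."""
    scores = [dot(query, c) for c in candidates]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return probs, best

def hierarchical_select(query, entities):
    """Two-level selection mirroring the dual-layer knowledge structure:
    first pick a high-level entity, then a low-level triple within it.
    `entities` is a list of (entity_embedding, [triple_embeddings])."""
    entity_embs = [emb for emb, _ in entities]
    _, ei = pointer_select(query, entity_embs)
    _, ti = pointer_select(query, entities[ei][1])
    return ei, ti

# Toy 2-D embeddings: the query aligns with the first entity's first triple.
query = [1.0, 0.0]
entities = [
    ([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0]]),  # entity 0 with two triples
    ([0.0, 1.0], [[0.5, 0.5]]),              # entity 1 with one triple
]
print(hierarchical_select(query, entities))  # → (0, 0)
```

Because selection is an explicit argmax over scored knowledge items, each generated span can be traced back to the entity and triple that were chosen, which is the source of the interpretability the abstract describes.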