🤖 AI Summary
To address the reliance on supervised data, severe hallucination, and inefficient inference in RDF-to-text generation, this paper proposes a neuro-symbolic approach based on collaborative multi-role LLM agents performing automated programming. Without human annotation or model fine-tuning, LLM agents autonomously parse RDF semantics, induce logical rules, and generate an interpretable Python-based rule engine—enabling zero-shot, purely rule-driven text generation. Our key contribution is replacing backpropagation with LLM-agent collaboration that “simulates training” through iterative reasoning and code synthesis, yielding the first fully interpretable, rule-based generator with millisecond-scale CPU-only inference (100× speedup) and near-zero resource overhead. Evaluated on WebNLG and OpenDialKG, our method significantly reduces hallucination; while fluency slightly lags behind fine-tuned models, it exhibits superior generalization across unseen RDF structures and domains.
📝 Abstract
We present a novel neuro-symbolic framework for RDF-to-text generation, in which the model is "trained" through collaborative interactions among multiple LLM agents rather than traditional backpropagation. Given only RDF triples, with no in-domain human reference texts, the LLM agents produce rule-based Python code for a generator tailored to the given domain. The resulting system is fully interpretable, requires no supervised training data, and generates text nearly instantaneously on a single CPU. Our experiments on the WebNLG and OpenDialKG datasets show that outputs produced by our approach reduce hallucination, with only slight fluency penalties compared to fine-tuned or prompted language models.
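To make the idea concrete, here is a minimal sketch of the *kind* of interpretable, rule-based generator described above — not the paper's actual generated code. The predicate-to-template rules, the `lexicalize` helper, and the fallback verbalization are all hypothetical illustrations of how a predicate-keyed rule engine could render RDF triples as text with pure CPU-bound lookups:

```python
# Hypothetical predicate -> template rules, as an LLM agent might induce them
# for a given domain (illustrative only, not the paper's generated rules).
RULES = {
    "birthPlace": "{s} was born in {o}.",
    "occupation": "{s} works as a {o}.",
    "capital": "{o} is the capital of {s}.",
}

def lexicalize(entity: str) -> str:
    """Turn an RDF identifier like 'Alan_Turing' into surface text."""
    return entity.replace("_", " ")

def generate(triples):
    """Render (subject, predicate, object) RDF triples as text via rule lookup."""
    sentences = []
    for s, p, o in triples:
        # Fall back to a generic verbalization for predicates with no rule.
        template = RULES.get(p, "{s} {p} {o}.")
        sentences.append(template.format(s=lexicalize(s), p=p, o=lexicalize(o)))
    return " ".join(sentences)
```

Because generation is a dictionary lookup plus string formatting, inference needs no GPU or learned weights, and every output sentence can be traced back to the specific rule that produced it.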