Factual and Edit-Sensitive Graph-to-Sequence Generation via Graph-Aware Adaptive Noising

📅 2026-04-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Graph-to-sequence generation often suffers from factual inaccuracies and insensitivity to structural edits in the input graph. This work proposes DLM4G, a non-autoregressive, diffusion-based graph-to-sequence framework that aligns graph elements (entities and relations) with their text tokens and applies a graph-aware adaptive noising strategy during iterative denoising. Noise on entity and relation tokens is modulated by per-token denoising error, which improves preservation of graph structure and responsiveness to edits; the method also extends to scientific generation tasks such as molecule captioning. Experiments show that DLM4G outperforms competitive diffusion baselines and fine-tuned autoregressive models up to 12x larger. Compared to the strongest fine-tuned pretrained language model baseline, it improves factual grounding (FGT@0.5) by 5.16% and edit sensitivity (ESR) by 7.9%.

📝 Abstract

Fine-tuned autoregressive models for graph-to-sequence (G2S) generation often struggle with factual grounding and edit sensitivity. To tackle these issues, we propose the Diffusion Language Model for Graphs (DLM4G), a non-autoregressive diffusion framework that generates text by iterative refinement conditioned on an input graph. DLM4G aligns graph components (entities and relations) with their corresponding sequence tokens and applies an adaptive noising strategy to them. The strategy uses per-token denoising error as a signal to modulate noise on entity and relation tokens, improving preservation of graph structure and enabling localized updates under graph edits. Evaluated on three datasets, DLM4G consistently outperforms competitive G2S diffusion baselines trained on identical splits across both surface-form and embedding-based metrics. DLM4G further exceeds fine-tuned autoregressive baselines up to 12x larger (e.g., T5-Large) and is competitive with zero-shot LLM transfer baselines up to 127x larger. Relative to the strongest fine-tuned PLM baseline, DLM4G improves factual grounding (FGT@0.5) by +5.16% and edit sensitivity (ESR) by +7.9%; compared to the best diffusion baseline, it yields gains of +3.75% in FGT@0.5 and +23.6% in ESR. We additionally demonstrate applicability beyond textual graphs through experiments on molecule captioning, indicating the method's generality for scientific G2S generation.
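The adaptive noising idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`adaptive_noise_levels`, `noise_embedding`), the `strength` parameter, the mean-relative scaling rule, and the direction of the adjustment (more noise for graph-aligned tokens with above-average error) are all assumptions chosen for illustration. The abstract states only that per-token denoising error modulates noise on entity and relation tokens.

```python
import math
import random


def adaptive_noise_levels(base_level, token_errors, graph_mask, strength=0.5):
    """Return one noise level in [0, 1] per token.

    base_level   -- scheduled noise level shared by all tokens at this step
    token_errors -- per-token denoising error (non-negative floats)
    graph_mask   -- True where the token is aligned to an entity or relation
    strength     -- hypothetical knob; a negative value reverses the direction
    """
    if len(token_errors) != len(graph_mask):
        raise ValueError("token_errors and graph_mask must have equal length")

    graph_errors = [e for e, m in zip(token_errors, graph_mask) if m]
    if not graph_errors:
        return [base_level] * len(token_errors)
    mean_err = sum(graph_errors) / len(graph_errors)

    levels = []
    for err, is_graph in zip(token_errors, graph_mask):
        if is_graph and mean_err > 0:
            # Assumed rule: scale noise by the error relative to the mean
            # error over graph-aligned tokens.
            relative = (err - mean_err) / mean_err
            level = base_level * (1.0 + strength * relative)
        else:
            # Tokens not aligned to the graph keep the base schedule.
            level = base_level
        levels.append(min(1.0, max(0.0, level)))
    return levels


def noise_embedding(x0, level, rng):
    """Standard Gaussian forward noising of one token embedding."""
    keep = math.sqrt(1.0 - level)
    spread = math.sqrt(level)
    return [keep * v + spread * rng.gauss(0.0, 1.0) for v in x0]


if __name__ == "__main__":
    errors = [0.1, 0.3, 0.2, 0.5]
    mask = [True, True, False, False]  # first two tokens are graph-aligned
    levels = adaptive_noise_levels(0.4, errors, mask)
    rng = random.Random(0)
    noisy = [noise_embedding([1.0, -1.0], lv, rng) for lv in levels]
    print(levels)
    print(noisy)
```

With a base level of 0.4, the two graph-aligned tokens in the example receive levels of roughly 0.3 and 0.5, while the remaining tokens stay at 0.4. Whether the actual method raises or lowers noise for high-error tokens is not specified in the abstract, which is why the sketch exposes the sign through `strength`.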

Problem

Research questions and friction points this paper is trying to address.

graph-to-sequence generation

factual grounding

edit sensitivity

adaptive noising

diffusion language model

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph-to-sequence generation

diffusion language model

adaptive noising