🤖 AI Summary
Graph-to-sequence generation often suffers from factual inaccuracies and insensitivity to structural edits in the input graph. This work proposes DLM4G, a non-autoregressive, diffusion-based graph-to-sequence framework that aligns graph elements (entities and relations) with their text tokens and applies a graph-aware adaptive noising strategy during iterative denoising. By modulating noise according to per-token denoising error, DLM4G improves fidelity to graph structure and responsiveness to edits, and it also applies to scientific generation tasks such as molecule captioning. Experiments show that DLM4G outperforms diffusion baselines trained on identical splits and exceeds fine-tuned autoregressive models up to 12 times larger. Compared to the strongest fine-tuned pretrained language model baseline, it improves factual grounding (FGT@0.5) by 5.16% and edit sensitivity (ESR) by 7.9%.
📝 Abstract
Fine-tuned autoregressive models for graph-to-sequence (G2S) generation often struggle with factual grounding and edit sensitivity. To tackle these issues, we propose the Diffusion Language Model for Graphs (DLM4G), a non-autoregressive diffusion framework that generates text by iterative refinement conditioned on an input graph. DLM4G aligns graph components (entities and relations) with their corresponding sequence tokens and applies an adaptive noising strategy to them: per-token denoising error serves as a signal to modulate the noise on entity and relation tokens, which improves preservation of graph structure and enables localized updates under graph edits. Evaluated on three datasets, DLM4G consistently outperforms competitive G2S diffusion baselines trained on identical splits, on both surface-form and embedding-based metrics. DLM4G also exceeds fine-tuned autoregressive baselines up to 12x larger (e.g., T5-Large) and is competitive with zero-shot LLM transfer baselines up to 127x larger. Relative to the strongest fine-tuned PLM baseline, DLM4G improves factual grounding (FGT@0.5) by +5.16% and edit sensitivity (ESR) by +7.9%; relative to the best diffusion baseline, the gains are +3.75% in FGT@0.5 and +23.6% in ESR. We additionally demonstrate applicability beyond textual graphs through experiments on molecule captioning, indicating the method's generality for scientific G2S generation.
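As a rough illustration of what error-guided noise modulation on graph-aligned tokens could look like, the Python sketch below computes one noise level per token. It is not the authors' implementation. The abstract does not give the update rule, so the function names, the `strength` parameter, the tanh-based scaling, and the direction of the adjustment (graph-aligned tokens with above-average denoising error receive less noise) are all assumptions made for this example.

```python
import math
import random


def adaptive_noise_levels(base_sigma, token_errors, is_graph_token, strength=0.5):
    """Return one noise level per token (hypothetical rule, not from the paper).

    Tokens aligned with graph elements (entities/relations) have their noise
    scaled according to how their denoising error compares with the mean error
    over graph-aligned tokens. All other tokens keep base_sigma.
    """
    graph_errors = [e for e, g in zip(token_errors, is_graph_token) if g]
    if not graph_errors:
        return [base_sigma] * len(token_errors)
    mean_err = sum(graph_errors) / len(graph_errors)

    sigmas = []
    for err, is_graph in zip(token_errors, is_graph_token):
        if not is_graph or mean_err == 0:
            sigmas.append(base_sigma)
            continue
        # Relative error squashed into (-1, 1). Assumption: above-average
        # error lowers the noise level, below-average error raises it.
        rel = math.tanh((err - mean_err) / mean_err)
        sigmas.append(base_sigma * (1.0 - strength * rel))
    return sigmas


def add_noise(embeddings, sigmas, rng):
    """Add Gaussian noise to each token embedding using its own noise level."""
    return [
        [x + sigma * rng.gauss(0.0, 1.0) for x in vec]
        for vec, sigma in zip(embeddings, sigmas)
    ]


# Toy example: tokens 1 and 3 are aligned with graph elements.
errors = [0.1, 0.4, 0.2, 0.9]
graph_mask = [False, True, False, True]
sigmas = adaptive_noise_levels(1.0, errors, graph_mask)
noisy = add_noise([[0.0, 0.0]] * 4, sigmas, random.Random(0))
```

In this toy setup only the entity and relation positions deviate from the base noise level, which is one way to keep edits to a graph element localized to the tokens aligned with it; the actual schedule used by DLM4G may differ.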