AI Summary
This work addresses the challenge in retrieval-augmented generation (RAG) where noisy or unreliable retrieved contexts often conflict with a model's internal knowledge, thereby degrading the output quality of diffusion-based language models. To mitigate this issue, the authors propose ARAM, a novel framework that introduces an adaptive guidance mechanism into mask-based diffusion RAG systems for the first time. ARAM dynamically modulates the strength of guidance during the denoising process by estimating the signal-to-noise ratio induced by distributional shifts from retrieved contexts, effectively balancing external evidence and internal knowledge without requiring additional training. Experimental results demonstrate that ARAM significantly outperforms existing RAG baselines across multiple knowledge-intensive question answering benchmarks, achieving substantial gains in answer accuracy.
Abstract
Retrieval-Augmented Generation (RAG) improves factual grounding by incorporating external knowledge into language model generation. However, when retrieved context is noisy, unreliable, or inconsistent with the model's parametric knowledge, it introduces retrieval-prior conflicts that can degrade generation quality. While this problem has been studied in autoregressive language models, it remains largely unexplored in diffusion-based language models, where the iterative denoising process introduces unique challenges for integrating retrieved context. In this work, we propose Adaptive Retrieval-Augmented Masked Diffusion (ARAM), a training-free adaptive guidance framework for Masked Diffusion Models (MDMs) in RAG settings. ARAM dynamically calibrates the guidance scale during denoising according to the Signal-to-Noise Ratio (SNR) of the distributional shift induced by retrieved context. Intuitively, the model strengthens guidance when the retrieved context provides reliable corrective evidence and suppresses it when the contextual signal is noisy or non-supportive. Extensive experiments on multiple knowledge-intensive QA benchmarks show that ARAM improves overall QA performance over competitive RAG baselines.
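The abstract describes the core mechanism only at a high level: the guidance scale at each denoising step is calibrated by the SNR of the distributional shift that the retrieved context induces. The sketch below is a hypothetical illustration of one such step, not the paper's actual method. It assumes classifier-free-guidance-style mixing of context-conditioned and unconditioned logits, and it estimates "signal" as the coherent component of the shift and "noise" as its spread across the vocabulary; the function name, the squashing formula, and `w_max` are all invented for illustration.

```python
import numpy as np

def adaptive_guidance_step(logits_uncond, logits_cond, w_max=2.0, eps=1e-8):
    """One denoising step's guidance mixing (hypothetical sketch).

    The distributional shift induced by the retrieved context is the
    difference between context-conditioned and unconditioned logits.
    An SNR-like statistic of this shift (coherent mean magnitude vs.
    spread across the vocabulary) modulates the guidance scale: a
    consistent shift strengthens guidance, a noisy one suppresses it.
    """
    delta = logits_cond - logits_uncond           # shift from retrieval
    signal = np.abs(delta.mean(axis=-1))          # coherent component
    noise = delta.std(axis=-1) + eps              # incoherent spread
    snr = signal / noise                          # per-position SNR proxy
    w = w_max * snr / (1.0 + snr)                 # squash into [0, w_max)
    return logits_uncond + w[..., None] * delta   # guided logits
```

Under this toy formulation, a retrieved context that shifts every token's logit in the same direction (reliable corrective evidence) receives a guidance weight near `w_max`, while a shift that looks like random noise is attenuated toward zero, matching the adaptive behavior the abstract describes.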