🤖 AI Summary
Large language models (LLMs) frequently generate hallucinated outputs, undermining answer reliability. Existing attribution methods struggle to achieve real-time, tight alignment between answer generation and document provenance in retrieval-augmented generation (RAG), often introducing significant latency. To address this, we propose LoDIT—a novel framework that jointly models document identifiers (Doc-IDs) and token-level logits to enable simultaneous answer generation and fine-grained document attribution. LoDIT integrates Doc-ID tokenization, logits-driven contribution estimation, and dynamic aggregation, enabling on-the-fly quantification of each retrieved document’s contribution during decoding—thereby balancing faithfulness and inference efficiency. Evaluated on the Trust-Align benchmark, LoDIT significantly outperforms state-of-the-art methods in attribution accuracy and answer fidelity, while reducing end-to-end latency and demonstrating strong robustness across diverse RAG configurations.
📝 Abstract
Despite their impressive performance, Large Language Models (LLMs) remain prone to hallucination, which critically undermines their trustworthiness. While most previous work has focused on answer and attribution correctness, a recent line of work investigates faithfulness, leveraging internal model signals to reflect the model's actual decision-making process while generating the answer. Nevertheless, these methods incur additional latency and struggle to directly align token generation with attribution generation. In this paper, we introduce LoDIT, a method that jointly generates and faithfully attributes answers in RAG by leveraging specific token logits during generation. It consists of two steps: (1) marking the documents with specific token identifiers and then leveraging the logits of these tokens to estimate the contribution of each document to the answer during generation, and (2) aggregating these contributions into document attributions. Experiments on a trustworthiness-focused attributed text-generation benchmark, Trust-Align, show that LoDIT significantly outperforms state-of-the-art models on several metrics. Finally, an in-depth analysis of LoDIT shows both its efficiency in terms of latency and its robustness in different settings.
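To make the two-step idea concrete, here is a minimal sketch of how logits at reserved Doc-ID token positions could be turned into per-step contribution scores and then aggregated into document-level attributions. All names (`doc_contributions`, `aggregate_attributions`), the softmax-based estimator, and the mean aggregation are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def doc_contributions(step_logits, doc_id_token_ids):
    """For one decoding step, read the logits at the reserved Doc-ID
    token positions and softmax-normalize them into per-document
    contribution scores (hypothetical estimator)."""
    scores = step_logits[doc_id_token_ids]
    exp = np.exp(scores - scores.max())  # shift for numerical stability
    return exp / exp.sum()

def aggregate_attributions(all_step_logits, doc_id_token_ids):
    """Step 2: average per-step contributions over the generated
    answer to obtain document-level attribution scores."""
    per_step = np.stack([doc_contributions(l, doc_id_token_ids)
                         for l in all_step_logits])
    return per_step.mean(axis=0)

# Toy example: vocabulary of 10 tokens; 3 retrieved documents are
# marked with reserved token ids 7, 8, 9; 4 decoding steps of
# synthetic logits stand in for a real model's outputs.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))
attr = aggregate_attributions(logits, np.array([7, 8, 9]))
print(attr)  # one attribution score per document, summing to 1
```

Because the scores are read off logits the model already computes at each step, attribution adds essentially no extra forward passes, which is consistent with the latency advantage the abstract claims.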