🤖 AI Summary
This paper addresses sentence-level citation deficiencies, such as missing, mismatched, or unverifiable citations, in large language model (LLM) responses; existing remedies rely heavily on costly manual annotation. To this end, we propose a self-supervised sentence-level citation alignment framework. Methodologically, we introduce a novel context-ablation reward mechanism based on the model's own feedback: by automatically ablating the cited context sentences, we generate fine-grained, annotation-free reward signals that jointly guide best-of-N sampling during inference and direct preference optimization (DPO) fine-tuning during training. The framework eliminates the need for human-annotated citations while improving citation fidelity both at inference time and through the fine-tuned model weights. Evaluated on five long-context question-answering tasks in LongBench-Cite, our approach achieves up to a 5.3-point improvement in citation F1, significantly boosting citation accuracy, traceability, and reliability.
📝 Abstract
We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from the context should prevent the same response; if sufficient, retaining the cited text alone should preserve the same response. This reward can guide the inference-time best-of-N sampling strategy to improve citation quality significantly, as well as be used in preference optimization to directly fine-tune the models for generating better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks.
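The context-ablation idea above can be made concrete as a reward over log-probabilities: removing the cited sentences should make the model much less likely to regenerate the response (necessity), while keeping only the cited sentences should largely preserve that likelihood (sufficiency). The sketch below is a minimal, hypothetical illustration of this principle; `ablation_reward` and `toy_log_prob` are illustrative names, and the toy scorer merely stands in for a real LLM likelihood, so the exact reward form here is an assumption rather than the paper's implementation.

```python
from typing import Callable, List

def ablation_reward(
    log_prob: Callable[[List[str], str], float],
    context: List[str],
    cited_idx: List[int],
    response: str,
) -> float:
    """Score a citation by context ablation (illustrative sketch).

    - Necessity (prob-drop): likelihood of the response should fall
      when the cited sentences are removed from the context.
    - Sufficiency (prob-hold): likelihood should be preserved when
      only the cited sentences are kept.
    """
    cited = set(cited_idx)
    full = log_prob(context, response)
    without_cited = log_prob(
        [s for i, s in enumerate(context) if i not in cited], response)
    only_cited = log_prob(
        [s for i, s in enumerate(context) if i in cited], response)
    prob_drop = full - without_cited   # large => citation is necessary
    prob_hold = only_cited - full      # near zero => citation is sufficient
    return prob_drop + prob_hold

# Toy stand-in for an LLM scorer: the "log-probability" of a response
# rises with how many of its words appear in the kept context.
def toy_log_prob(context: List[str], response: str) -> float:
    words = set(" ".join(context).lower().split())
    return float(sum(w in words for w in response.lower().split()))

ctx = ["The sky is blue.", "Water boils at 100 C.", "Cats purr."]
resp = "Water boils at 100 C."
good = ablation_reward(toy_log_prob, ctx, [1], resp)  # correct citation
bad = ablation_reward(toy_log_prob, ctx, [0], resp)   # wrong citation
```

In this toy setup the correct citation scores higher than the wrong one, which is exactly the signal used to rank best-of-N candidates or to build preference pairs for DPO-style fine-tuning.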