🤖 AI Summary
Large language models (LLMs) often generate inaccurate citations in scientific writing and deviate from human authors' citation practices. To address this, we propose citation attribution alignment, a paradigm that reframes citation evaluation as determining whether a generated citation matches the citation a human author would actually choose. We introduce CiteGuard, a retrieval-aware proxy evaluation framework that combines LLM reasoning with external knowledge retrieval. Through multi-step verification, CiteGuard aligns generated citations with human citation behavior at a fine granularity, and it can also identify valid alternative citations that support the same claim but differ from the human author's choice. Evaluated on the CiteME benchmark, CiteGuard reaches 65.4% accuracy, approaching human-level performance (69.7%) and surpassing the prior state of the art by 12.3%. This improves both the reliability and interpretability of automated citation assessment.
📝 Abstract
Large Language Models (LLMs) have emerged as promising assistants for scientific writing. However, concerns remain about the quality and reliability of the generated text, one of which is citation accuracy and faithfulness. While most recent work relies on methods such as LLM-as-a-Judge, the reliability of LLM-as-a-Judge alone is itself in doubt. In this work, we reframe citation evaluation as a problem of citation attribution alignment: assessing whether LLM-generated citations match those a human author would include for the same text. We propose CiteGuard, a retrieval-aware agent framework designed to provide more faithful grounding for citation validation. CiteGuard improves on the prior baseline by 12.3% and achieves up to 65.4% accuracy on the CiteME benchmark, on par with human-level performance (69.7%). It also enables the identification of alternative but valid citations.
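To make the "retrieval-aware verification" idea concrete, the sketch below shows one way such a pipeline could look: retrieve candidate papers for a claim, check whether the generated citation appears among them, and surface other retrieved candidates as possible valid alternatives. This is purely illustrative; the corpus, function names (`retrieve`, `verify_citation`), and keyword-overlap scoring are placeholder assumptions, not CiteGuard's actual components.

```python
# Hypothetical sketch of retrieval-aware citation checking.
# All names and data here are illustrative, not CiteGuard's real API.
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    abstract: str


# Toy in-memory corpus standing in for an external paper index (assumption).
CORPUS = {
    "attention-2017": Paper(
        "Attention Is All You Need",
        "We propose the Transformer, based solely on attention.",
    ),
    "bert-2019": Paper(
        "BERT", "Pre-training of deep bidirectional Transformers."
    ),
}


def retrieve(claim: str, corpus: dict) -> list[str]:
    """Step 1: rank candidate papers by naive keyword overlap with the claim."""
    words = set(claim.lower().split())
    scored = []
    for pid, paper in corpus.items():
        text = (paper.title + " " + paper.abstract).lower()
        score = sum(1 for w in words if w in text)
        if score:
            scored.append((score, pid))
    return [pid for _, pid in sorted(scored, reverse=True)]


def verify_citation(claim: str, cited_id: str, corpus: dict) -> dict:
    """Steps 2-3: check the generated citation against retrieved evidence,
    and report other retrieved candidates as possible valid alternatives."""
    candidates = retrieve(claim, corpus)
    return {
        "supported": cited_id in candidates,
        "alternatives": [pid for pid in candidates if pid != cited_id],
    }


result = verify_citation(
    "The Transformer architecture relies solely on attention mechanisms.",
    cited_id="attention-2017",
    corpus=CORPUS,
)
```

A real system would replace the keyword scorer with an LLM-plus-retrieval loop over an actual paper index, but the same interface applies: a support decision for the generated citation, plus a list of alternative candidates.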