🤖 AI Summary
This work addresses the detection of hallucinated citations that large language models produce in academic writing. Existing detectors rely on brittle parsing or incomplete retrieval and mostly return binary found/not-found decisions, so they give auditors little field-level signal. The authors propose CiteTracer, a cascading multi-agent framework that verifies citations field by field and recasts hallucination detection as adjudication against a 12-code taxonomy spanning Real, Potential, and Hallucinated citations. It combines structured extraction from PDF and BibTeX, multi-source evidence retrieval (cache lookup, URL fetch, scholar connectors, and web search), deterministic field matching, and routing of ambiguous cases to class-specialist judgers. On a synthetic benchmark of 2,450 citations, CiteTracer reaches 97.1% accuracy, with class-level F1 scores of 97.0, 95.8, and 98.5 for Real, Potential, and Hallucinated, respectively; on a separate set of 957 real-world fabricated citations, it detects 97.1% of fabrications without abstaining.
📝 Abstract
Large language models are increasingly used in scientific writing, yet they can fabricate citation-shaped references that appear plausible but fail bibliographic verification. Existing detectors often reduce verification to binary found/not-found decisions and rely on brittle parsing or incomplete retrieval, offering little field-level signal to auditors. We reframe citation hallucination detection as taxonomy-aligned field-level adjudication and introduce a 12-code taxonomy spanning Real, Potential, and Hallucinated citations. Based on this taxonomy, we build CiteTracer, a cascading multi-agent detector that extracts structured citations from PDF and BibTeX, retrieves evidence through cache lookup, URL fetch, scholar connectors, and web search, applies deterministic field matching, and routes ambiguous cases to class-specialist judgers. We release a benchmark of 2,450 synthetic citations built from real seeds with controlled LLM mutations, paired with 957 real-world fabricated citations drawn from desk-rejected submissions to ICLR 2026 and an anonymous conference. CiteTracer reaches 97.1% accuracy on the synthetic benchmark, with class-level F1 scores of 97.0, 95.8, and 98.5 for Real, Potential, and Hallucinated, respectively, and detects 97.1% of fabrications on the real-world set without abstaining. Code: https://github.com/aaFrostnova/CiteTracer.
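The cascade described in the abstract (ordered evidence retrieval, deterministic field matching, then routing of unresolved cases to a judger) can be pictured with a minimal Python sketch. This is not the CiteTracer implementation: every name here (`normalize`, `retrieve`, `match_fields`, `adjudicate`), the choice of fields, the exact-match rule, and the shortcut that labels a fully matching citation "Real" are illustrative assumptions, and the 12-code taxonomy is not modeled.

```python
import re
from typing import Callable, Optional

# Hypothetical types: a citation or evidence record is a flat field -> value map.
Record = dict[str, str]
Retriever = Callable[[Record], Optional[Record]]
Judger = Callable[[Record, Optional[Record], dict[str, bool]], str]


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation and whitespace before comparing."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def retrieve(citation: Record, retrievers: list[Retriever]) -> Optional[Record]:
    """Try each evidence source in order and stop at the first hit.

    The order (e.g. cache, URL fetch, scholar connector, web search) is
    whatever the caller passes in; it is an assumption of this sketch.
    """
    for source in retrievers:
        evidence = source(citation)
        if evidence is not None:
            return evidence
    return None


def match_fields(
    citation: Record,
    evidence: Record,
    fields: tuple[str, ...] = ("title", "authors", "year", "venue"),
) -> dict[str, bool]:
    """Deterministic per-field comparison after normalization."""
    return {
        f: normalize(citation.get(f, "")) == normalize(evidence.get(f, ""))
        for f in fields
    }


def adjudicate(citation: Record, retrievers: list[Retriever], judger: Judger) -> str:
    """Return a label, deferring to the judger whenever matching is not clean."""
    evidence = retrieve(citation, retrievers)
    if evidence is None:
        # No evidence found: assumed here to be a case for the judger.
        return judger(citation, None, {})
    report = match_fields(citation, evidence)
    if all(report.values()):
        return "Real"  # assumed shortcut when every field agrees
    return judger(citation, evidence, report)
```

In this sketch the judger is an injected callable, so a rule-based stub or a model-backed specialist could be swapped in without touching the retrieval or matching steps. The per-field report passed to it is what would give an auditor the field-level signal that a binary found/not-found check lacks.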