🤖 AI Summary
Large language models are prone to hallucination in multi-step reasoning, and existing conformal prediction methods for factuality rely on hand-crafted scorers that, at high reliability levels, discard a large share of true claims. This work proposes Differentiable Coherent Factuality (DCF), a fully differentiable relaxation of Coherent Factuality. The method represents an output's claims as a dependency graph and validates each claim jointly with its logical ancestors, which allows the scoring function to be optimized end to end. By combining conformal prediction, dependency graph modeling, and differentiable relaxation, DCF keeps the hallucination rate below a user-specified level (e.g., 10%) with the same statistical guarantee as the original algorithm, while substantially improving truth retention: claim retention improves by up to 141% over baselines on two reasoning benchmarks.
📝 Abstract
Large Language Models (LLMs) frequently hallucinate, limiting their reliability in critical applications. Conformal Prediction (CP) addresses this by calibrating error rates on held-out data to provide statistically valid confidence guarantees. Recent work extends CP to LLM factuality by filtering out risky claims, ensuring that hallucination rates remain below a user-specified level (e.g., 10%). While prior methods treat claims independently, Coherent Factuality extends this approach to multi-step reasoning by representing outputs as dependency graphs and jointly validating claims with their logical ancestors. A key limitation is that Coherent Factuality is not differentiable, so it requires hand-crafted scorers that, at high reliability levels, remove nearly 60% of true claims. We introduce Differentiable Coherent Factuality (DCF), a fully differentiable relaxation that enables learning improved scorers while provably recovering the original algorithm's guarantees. Experiments on two benchmark reasoning datasets demonstrate that DCF achieves up to a 141% improvement in claim retention while maintaining reliability guarantees, representing a significant step towards reliable conformal LLM systems.
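The calibrate-then-filter idea described in the abstract can be illustrated with a minimal split-conformal sketch. This is not the paper's algorithm or its learned scorer: the function names (`effective_scores`, `calibrate_threshold`, `filter_claims`), the use of the minimum score over a claim and its ancestors, and the toy calibration data are all assumptions made for illustration. Claims are assumed to be listed in topological order, so every parent index is smaller than its child's index.

```python
import math

def effective_scores(scores, parents):
    """Score each claim by the weakest link among itself and its ancestors.

    Assumes claims are in topological order (parents precede children).
    """
    eff = []
    for i, s in enumerate(scores):
        eff.append(min([s] + [eff[p] for p in parents[i]]))
    return eff

def calibrate_threshold(calibration, alpha):
    """Split-conformal threshold from held-out (scores, parents, is_true) examples.

    For each example, record the highest effective score of any false claim;
    keeping only claims strictly above that value yields an error-free output.
    The threshold is the ceil((n + 1) * (1 - alpha))-th smallest such value.
    """
    residuals = []
    for scores, parents, is_true in calibration:
        eff = effective_scores(scores, parents)
        false_eff = [e for e, ok in zip(eff, is_true) if not ok]
        residuals.append(max(false_eff, default=-math.inf))
    residuals.sort()
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    # Too few calibration examples for this alpha: keep nothing.
    return math.inf if k > n else residuals[k - 1]

def filter_claims(scores, parents, threshold):
    """Return indices of claims whose effective score exceeds the threshold."""
    eff = effective_scores(scores, parents)
    return [i for i, e in enumerate(eff) if e > threshold]

# Hypothetical calibration set: (claim scores, parent indices, truth labels).
calibration = [
    ([0.9, 0.8, 0.3], [[], [0], [1]], [True, True, False]),
    ([0.7, 0.6], [[], [0]], [True, False]),
    ([0.5, 0.9], [[], [0]], [True, True]),
]
threshold = calibrate_threshold(calibration, alpha=0.5)
kept = filter_claims([0.9, 0.2, 0.95], [[], [0], [1]], threshold)
```

In the last call, the third claim has a high score of its own but is dropped because it depends on a low-scoring ancestor, which is the sense in which claims are validated jointly with their ancestors. The hard `min` and the threshold comparison are not differentiable, which is the kind of obstacle a differentiable relaxation would have to replace with a smooth surrogate in order to train the scorer.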