🤖 AI Summary
AI question-answering systems frequently provide unverifiable citations, undermining answer credibility. To address this, we propose “Attribution Gradients,” a method that constructs progressive, claim-level provenance paths from AI-generated answers back to the original scientific literature. It decomposes answers into fine-grained claims, automatically retrieves supporting or contradictory textual excerpts from source documents, and enables interactive, hierarchical citation exploration, including inline expansion and direct document navigation. The approach integrates text mining with purpose-built interactive interface design, grounded in a curated scientific literature corpus to ensure evidence-level auditability. A user study shows that, compared with a baseline system, Attribution Gradients leads users to examine source documents more deeply and to make substantially more evidence-based, critical revisions when refining AI answers. This work presents the first claim-driven, multi-level, interactive attribution verification framework for AI-generated citations.
📝 Abstract
AI question answering systems increasingly generate responses with attributions to sources. However, verifying the actual content of those attributions is often impractical. In this paper, we present attribution gradients as a solution. Attribution gradients provide integrated, incremental affordances for diving into an attributed passage. A user can decompose a sentence of an answer into its claims. For each claim, the user can view supporting and contradictory excerpts mined from the sources. Those excerpts serve as clickable conduits into the source documents (in our application, scientific papers). When evidence itself contains further citations, the UI unpacks the evidence into excerpts from the cited sources. These features of attribution gradients create concurrent interconnections among answer, claim, excerpt, and context. In a usability study, participants who revised an attributed AI answer engaged more deeply with sources and made richer revisions with attribution gradients than with a baseline.
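The hierarchy the abstract describes, answer sentence → claim → evidence excerpt → nested excerpt from a cited source, can be sketched as a small data model. This is an illustrative sketch only, not the paper's implementation: all class and field names (`AnswerSentence`, `Claim`, `Excerpt`, `stance`, `nested`) are hypothetical, and the claim decomposition and evidence mining steps are assumed to happen upstream.

```python
from dataclasses import dataclass, field

@dataclass
class Excerpt:
    text: str
    source_id: str  # hypothetical identifier of the paper the excerpt was mined from
    stance: str     # "supporting" or "contradictory"
    # Excerpts mined from papers that this piece of evidence itself cites,
    # enabling the recursive "unpacking" the abstract describes.
    nested: list = field(default_factory=list)

@dataclass
class Claim:
    text: str
    excerpts: list = field(default_factory=list)

@dataclass
class AnswerSentence:
    text: str
    claims: list = field(default_factory=list)

def expand(sentence: AnswerSentence) -> list:
    """Flatten one provenance path for display: each row is
    (indent_depth, label, text), mimicking inline hierarchical expansion."""
    rows = []
    for claim in sentence.claims:
        rows.append((0, "claim", claim.text))
        for ex in claim.excerpts:
            rows.append((1, ex.stance, ex.text))
            for sub in ex.nested:
                rows.append((2, sub.stance, sub.text))
    return rows
```

A UI would render each row at its indent depth, with excerpt rows acting as clickable links into the source document via `source_id`; the fixed two-level loop here stands in for what would be arbitrary-depth recursion in a real interface.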