🤖 AI Summary
This study addresses the challenge of fostering clinician trust in large language model (LLM) recommendations for high-stakes oncology care, where conventional explainability methods have proven insufficient. The authors propose an atomic fact-checking mechanism that decomposes LLM-generated treatment suggestions into independently verifiable atomic facts and traces each back to authoritative clinical guidelines. In a randomized controlled trial involving 356 physicians, this approach significantly increased the proportion of clinicians who trusted the AI recommendations—from 26.9% to 66.5% (Cohen’s d = 0.94)—substantially outperforming existing transparency techniques. This work represents the first demonstration of verifiable, traceable, and trustworthy AI decision support in real-world, high-risk clinical settings.
📝 Abstract
Question: Does atomic fact-checking, which decomposes AI treatment recommendations into individually verifiable claims linked to source guideline documents, increase clinician trust compared to traditional explainability approaches?
Findings: In this randomized trial of 356 clinicians generating 7,476 trust ratings, atomic fact-checking produced a large effect on trust (Cohen's d = 0.94), increasing the proportion of clinicians expressing trust from 26.9% to 66.5%. Traditional transparency mechanisms showed a dose-response gradient of improvement over baseline (d = 0.25 to 0.50).
Meaning: Decomposing AI recommendations into individually verifiable claims linked to source guidelines produces substantially higher clinician trust than traditional explainability approaches in high-stakes clinical decisions.