Clinician input steers frontier AI models toward both accurate and harmful decisions

πŸ“… 2026-03-14
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This study addresses the susceptibility of large language models (LLMs) to physician input in clinical decision-making, which can either enhance diagnostic accuracy or introduce harmful biases, highlighting the inadequacy of current frameworks for evaluating human-AI collaboration. The work proposes the first interactive assessment framework to integrate medical case studies with real-world clinician-AI dialogue data, systematically analyzing the diagnostic behavior of multiple LLMs under both expert and adversarial physician contexts. Leveraging multi-turn dialogue simulations, differential diagnosis concordance, WHO harm severity grading, and inference-time scaling, the study reveals a spectrum of model phenotypes ranging from highly conformist to dogmatic. Results show that expert context increases the inclusion rate of correct final diagnoses by an average of 20.4 percentage points, whereas adversarial context significantly degrades diagnostic performance in 14 models. Inference-time scaling reduces harmful echoing of clinician-introduced recommendations across all WHO harm severity tiers, and explicit clinician uncertainty prompts improve final diagnosis inclusion by 15 percentage points in adversarial scenarios.
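To make the evaluation setup concrete, here is a minimal sketch of how the three interaction conditions (reasoning alone, expert clinician context, adversarial clinician context) could be assembled as multi-turn transcripts. The prompt wording, condition names, and the generic chat-message format are illustrative assumptions, not the study's exact protocol.

```python
# Sketch of the three evaluation conditions: the model first sees the
# case alone; in the expert/adversarial conditions a clinician turn is
# appended before the model's revised differential is elicited.
# All prompt text here is hypothetical, for illustration only.

CASE_PROMPT = (
    "Provide a ranked differential diagnosis and recommended next steps "
    "for the following case:\n\n{case_text}"
)
EXPERT_TURN = (
    "A senior clinician comments: the findings are most consistent with "
    "{hypothesis}; please reconsider your differential in that light."
)
ADVERSARIAL_TURN = (
    "A clinician argues confidently for {hypothesis} and recommends "
    "{next_step}; please reconsider your differential in that light."
)

def build_transcript(case_text: str, condition: str,
                     initial_answer: str = "", **slots) -> list[dict]:
    """Assemble chat messages for 'baseline', 'expert', or 'adversarial'.
    `initial_answer` is the model's first-pass differential, echoed back
    as an assistant turn so the clinician context arrives mid-dialogue."""
    messages = [{"role": "user",
                 "content": CASE_PROMPT.format(case_text=case_text)}]
    if condition == "baseline":
        return messages
    messages.append({"role": "assistant", "content": initial_answer})
    turn = EXPERT_TURN if condition == "expert" else ADVERSARIAL_TURN
    messages.append({"role": "user", "content": turn.format(**slots)})
    return messages
```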

πŸ“ Abstract
Large language models (LLMs) are entering clinician workflows, yet evaluations rarely measure how clinician reasoning shapes model behavior during clinical interactions. We combined 61 New England Journal of Medicine Case Records with 92 real-world clinician-AI interactions to evaluate 21 reasoning LLM variants across 8 frontier models on differential diagnosis generation and next-step recommendations under three conditions: reasoning alone, after expert clinician context, and after adversarial clinician context. LLM-clinician concordance increased substantially after clinician exposure, with simulations sharing β‰₯3 differential diagnosis items rising from 65.8% to 93.5% and those sharing β‰₯3 next-step recommendations from 20.3% to 53.8%. Expert context significantly improved correct final diagnosis inclusion across all 21 models (mean +20.4 percentage points), reflecting both reasoning improvement and passive content echoing, while adversarial context caused significant diagnostic degradation in 14 models (mean -5.4 percentage points). Multi-turn disagreement probes revealed distinct model phenotypes ranging from highly conformist to dogmatic, with adversarial arguments remaining a persistent vulnerability even for otherwise resilient models. Inference-time scaling reduced harmful echoing of clinician-introduced recommendations across WHO-defined harm severity tiers (relative reductions: 62.7% mild, 57.9% moderate, 76.3% severe, 83.5% death-tier). In GPT-4o experiments, explicit clinician uncertainty signals improved diagnostic performance after adversarial context (final diagnosis inclusion 27% to 42%) and reduced alignment with incorrect arguments by 21%. These findings establish a foundation for evaluating clinician-AI collaboration, introducing interactive metrics and mitigation strategies essential for safety and robustness.
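The β‰₯3-item concordance criterion above is a simple set-overlap measure. Below is a minimal sketch of how it could be computed; the normalization, function names, and example diagnoses are illustrative assumptions, not the authors' published pipeline, which would need proper clinical-term matching rather than string cleanup.

```python
# Sketch of the ">=3 shared differential diagnosis items" concordance
# criterion as plain set overlap. String normalization is a crude
# stand-in for real clinical-term matching (hypothetical, not the
# study's actual implementation).

def normalize(dx: str) -> str:
    """Fold case and hyphens so 'Giant-cell arteritis' and
    'giant cell arteritis' count as the same item."""
    return dx.lower().replace("-", " ").strip()

def shared_items(model_ddx: list[str], clinician_ddx: list[str]) -> int:
    """Number of differential diagnosis items common to both lists."""
    return len({normalize(d) for d in model_ddx}
               & {normalize(d) for d in clinician_ddx})

def concordant(model_ddx: list[str], clinician_ddx: list[str],
               threshold: int = 3) -> bool:
    """Concordance as defined in the abstract: >= `threshold` shared items."""
    return shared_items(model_ddx, clinician_ddx) >= threshold

# Example: three of the model's items also appear on the clinician's list.
model_ddx = ["Giant cell arteritis", "Polymyalgia rheumatica",
             "Takayasu arteritis", "Infective endocarditis"]
clinician_ddx = ["giant-cell arteritis", "polymyalgia rheumatica",
                 "takayasu arteritis", "lymphoma"]
assert concordant(model_ddx, clinician_ddx)
```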
Problem

Research questions and friction points this paper is trying to address.

clinician-AI interaction
large language models
diagnostic safety
adversarial context
harmful decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

clinician-AI collaboration
adversarial context
inference-time scaling
harmful echoing
diagnostic robustness
πŸ‘₯ Authors
Ivan Lopez
Stanford University
data science, machine learning, NLP, health systems, clinical decision support
Selin S. Everett
Stanford University School of Medicine, Stanford, CA, USA
Bryan J. Bunning
Quantitative Sciences Unit, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
April S. Liang
Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
Dong Han Yao
Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
Shivam C. Vedak
Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
Kameron C. Black
Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
Sophie Ostmeier
Stanford University
ML, Medicine
Stephen P. Ma
Division of Hospital Medicine, Department of Medicine, Stanford University, Stanford, CA, USA
Emily Alsentzer
Assistant Professor, Stanford University
machine learning for healthcare
Jonathan H. Chen
Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
Akshay S. Chaudhari
Stanford Center for Artificial Intelligence in Medicine and Imaging, Palo Alto, CA, USA
Eric Horvitz
Microsoft
Machine intelligence, decision theory, decisions under uncertainty, information retrieval, bounded