🤖 AI Summary
This study investigates how radiologists collaborate with an FDA-approved AI system for diagnosing pulmonary embolism in real-world clinical settings. Using longitudinal observational data from nearly 400 radiologists interpreting over 100,000 CT scans, linked to electronic health records, AI usage logs, and imaging follow-ups, the authors apply cohort analysis and heterogeneity modeling to trace how human-AI collaboration evolves over time. Radiologists agree with the AI in 97% of AI-negative cases and 84% of AI-positive cases. Per-radiologist monthly volumes nearly double, while diagnostic speed and patient mortality remain unchanged. AI engagement has a nonlinear relationship with agreement: moderate users show the highest agreement, while low and high users disagree with the AI more often. Override behavior also varies across radiologists, including by gender.
📝 Abstract
We study how radiologists use AI to diagnose pulmonary embolism (PE), tracking over 100,000 scans interpreted by nearly 400 radiologists during the staggered rollout of a real-world FDA-approved diagnostic platform in a hospital system. When AI flags PE, radiologists agree 84% of the time; when AI predicts no PE, they agree 97% of the time. Disagreement evolves substantially: radiologists initially reject AI-positive PEs in 30% of cases, a rate that drops to 12% by year two. Despite a 16% increase in scan volume, diagnostic speed remains stable while per-radiologist monthly volumes nearly double, with no change in patient mortality, suggesting AI improves workflow without compromising outcomes. We document significant heterogeneity in AI collaboration: some radiologists reject AI-flagged PEs half the time while others almost always accept them; female radiologists are 6 percentage points less likely to override AI than male radiologists. Moderate AI engagement is associated with the highest agreement, whereas both low and high engagement show more disagreement. Follow-up imaging reveals that when radiologists override AI to diagnose PE, 54% of subsequent scans within 30 days show radiologist and AI agreeing on no PE.
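The agreement figures above are conditional on the AI's output: 84% is the share of AI-positive scans the radiologist also calls PE, and 97% is the share of AI-negative scans the radiologist also calls no PE. The following is a minimal sketch of that calculation, not the authors' code. The function name `agreement_rates`, the boolean-pair record layout, and the toy counts are all assumptions made for illustration; the counts were chosen only so the output matches the reported rates.

```python
from collections import Counter


def agreement_rates(reads):
    """Compute radiologist-AI agreement conditional on the AI output.

    `reads` is an iterable of (ai_flags_pe, radiologist_diagnoses_pe)
    boolean pairs, one per scan. This layout is hypothetical.
    """
    counts = Counter(reads)
    ai_pos = counts[(True, True)] + counts[(True, False)]
    ai_neg = counts[(False, False)] + counts[(False, True)]
    return {
        # Share of AI-positive scans where the radiologist also diagnoses PE.
        "agree_given_ai_positive": counts[(True, True)] / ai_pos if ai_pos else None,
        # Share of AI-negative scans where the radiologist also diagnoses no PE.
        "agree_given_ai_negative": counts[(False, False)] / ai_neg if ai_neg else None,
    }


# Made-up toy data (not the study's data): 100 AI-positive and 100 AI-negative scans.
toy_reads = (
    [(True, True)] * 84 + [(True, False)] * 16
    + [(False, False)] * 97 + [(False, True)] * 3
)
rates = agreement_rates(toy_reads)
print(rates)
```

Under this definition, the override rate on AI-positive scans is one minus the first value, which is how the reported drop from 30% to 12% rejection can be read.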