🤖 AI Summary
This study investigates how free-text reporting, structured reporting, and AI-assisted structured reporting affect radiologists’ interpretation of chest X-rays. Using eye tracking and a custom image-reading platform, we quantified diagnostic accuracy, reporting efficiency, visual behavior (e.g., saccade count and dwell time on the report region), and user experience, analyzed them with generalized linear mixed models and Bonferroni-corrected post-hoc tests, and examined the moderating effect of reader experience. AI-assisted structured reporting yielded the highest agreement with expert consensus (Cohen’s κ = 0.71), the shortest reporting time (25 ± 9 seconds per radiograph), eye-tracking measures consistent with lower cognitive load (saccades on the radiograph: 97 ± 58 vs. 205 ± 135 with free text; total fixation duration on the report field: 4 ± 1 s vs. 11 ± 5 s), and the highest user preference. Novice readers shifted their gaze toward the radiograph under structured reporting, whereas more experienced readers kept their focus on the radiograph. By combining eye-tracking, performance, and questionnaire data, the study provides empirical evidence for the design of human–AI collaborative reporting in clinical practice.
📝 Abstract
Structured reporting (SR) and artificial intelligence (AI) may transform how radiologists interact with imaging studies. This prospective study (July to December 2024) evaluated the impact of three reporting modes, free-text (FT), structured reporting (SR), and AI-assisted structured reporting (AI-SR), on image analysis behavior, diagnostic accuracy, efficiency, and user experience. Four novice and four non-novice readers (radiologists and medical students) each analyzed 35 bedside chest radiographs per session using a customized viewer and an eye-tracking system. Outcomes included diagnostic accuracy (compared with expert consensus using Cohen's $\kappa$), reporting time per radiograph, eye-tracking metrics, and questionnaire-based user experience. Statistical analysis used generalized linear mixed models with Bonferroni post-hoc tests with a significance level of $P \le .01$. Diagnostic accuracy was similar in FT ($\kappa = 0.58$) and SR ($\kappa = 0.60$) but higher in AI-SR ($\kappa = 0.71$, $P < .001$). Reporting times decreased from $88 \pm 38$ s (FT) to $37 \pm 18$ s (SR) and $25 \pm 9$ s (AI-SR) ($P < .001$). Saccade counts for the radiograph field ($205 \pm 135$ (FT), $123 \pm 88$ (SR), $97 \pm 58$ (AI-SR)) and total fixation duration for the report field ($11 \pm 5$ s (FT), $5 \pm 3$ s (SR), $4 \pm 1$ s (AI-SR)) were lower with SR and AI-SR ($P < .001$ each). Novice readers shifted gaze towards the radiograph in SR, while non-novice readers maintained their focus on the radiograph. AI-SR was the preferred mode. In conclusion, SR improves efficiency by guiding visual attention toward the image, and AI-prefilled SR further enhances diagnostic accuracy and user satisfaction.
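
For readers unfamiliar with the agreement statistic used above, the following is a minimal sketch of how Cohen's $\kappa$ between a reader and an expert consensus could be computed. The function name `cohens_kappa` and the binary toy labels are hypothetical and chosen only for illustration; they are not the study's data, and the study's actual label scheme and analysis pipeline are not described in the abstract.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters need the same non-zero number of labels")

    n = len(rater_a)

    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over all categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 1 = finding present, 0 = finding absent.
reader = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
consensus = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]

kappa = cohens_kappa(reader, consensus)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on 8 of 10 items (observed agreement 0.8) while chance agreement is 0.5, so $\kappa = (0.8 - 0.5) / (1 - 0.5) = 0.6$. The statistic corrects raw agreement for the agreement expected by chance, which is why it is preferred over simple percent agreement when comparing reporting modes.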