🤖 AI Summary
In chest X-ray interpretation, perceptual errors (visible abnormalities that are overlooked during the read) are frequent and clinically consequential, yet current AI systems offer little support for catching such errors after interpretation. To address this, we propose RADAR, the first AI collaborator explicitly designed to assist radiologists in identifying perceptual errors during post-interpretation review. RADAR employs region-level attention modeling and multi-scale feature fusion to learn simulated perceptual-error patterns under weak supervision. Critically, it introduces an observer-variability-aware ROI suggestion mechanism that provides interpretable, non-intrusive decision support, bypassing rigid ground-truth annotations and accommodating inter-radiologist interpretive variability. Evaluated on a simulated perceptual-error dataset, RADAR achieves a recall of 0.78, an F1 score of 0.56, and a median IoU of 0.78, with over 90% of suggested ROIs attaining IoU > 0.5. The system, dataset, and interactive demo are fully open-sourced.
📝 Abstract
Chest radiography is widely used in diagnostic imaging. However, perceptual errors -- especially overlooked but visible abnormalities -- remain common and clinically significant. Current workflows and AI systems provide limited support for detecting such errors after interpretation and often lack meaningful human--AI collaboration. We introduce RADAR (Radiologist--AI Diagnostic Assistance and Review), a post-interpretation companion system. RADAR ingests finalized radiologist annotations and CXR images, then performs region-level analysis to detect and refer potentially missed abnormal regions. The system supports a "second-look" workflow and offers suggested regions of interest (ROIs) rather than fixed labels to accommodate inter-observer variation. We evaluated RADAR on a simulated perceptual-error dataset derived from de-identified CXR cases, using F1 score and Intersection over Union (IoU) as primary metrics. RADAR achieved a recall of 0.78, a precision of 0.44, and an F1 score of 0.56 in detecting missed abnormalities on this dataset. Although precision is moderate, the suggestion-based design encourages radiologist oversight rather than over-reliance on AI output in human--AI collaboration. The median IoU was 0.78, with more than 90% of referrals exceeding 0.5 IoU, indicating accurate regional localization. RADAR effectively complements radiologist judgment, providing valuable post-read support for perceptual-error detection in CXR interpretation. Its flexible ROI suggestions and non-intrusive integration position it as a promising tool for real-world radiology workflows. To facilitate reproducibility and further evaluation, we release a fully open-source web implementation alongside a simulated error dataset. All code, data, demonstration videos, and the application are publicly available at https://github.com/avutukuri01/RADAR.
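The reported metrics are standard and easy to verify: F1 is the harmonic mean of precision and recall, and IoU compares a suggested ROI against a reference region. A minimal sketch (the box format and function names here are illustrative, not from the RADAR codebase) confirms that a precision of 0.44 and recall of 0.78 yield the reported F1 of 0.56:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def box_iou(a: tuple, b: tuple) -> float:
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

# Reported precision/recall reproduce the reported F1.
print(round(f1_score(0.44, 0.78), 2))  # 0.56
# Example: two partially overlapping unit-area-4 boxes.
print(round(box_iou((0, 0, 2, 2), (1, 1, 3, 3)), 3))  # 0.143
```

Under this definition, a referral "exceeding 0.5 IoU" means its overlap with the reference region is larger than the non-overlapping remainder of their union.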