🤖 AI Summary
This study addresses the clinical challenge of manually interpreting dental panoramic radiographs (DPRs), a task hindered by anatomical superimposition and time constraints. We developed the first multi-task, cross-nationally generalizable AI system for tooth-level localization and classification of eight oral pathologies. Our method integrates YOLO- and DETR-based object detection with U-Net– and DeepLab–based semantic segmentation into a unified end-to-end framework. Validated on real-world clinical datasets from the Netherlands, Brazil, and Taiwan, the system achieves a macro-averaged AUC-ROC of 96.2% and processes images 79× faster than human experts. For 7 of 8 pathologies, AI–reference standard agreement is non-inferior to inter-expert agreement; notably, sensitivity for periapical radiolucencies improves by 67.9%. This work provides the first empirical evidence, across diverse national imaging protocols and patient populations, that AI-based dental diagnosis exhibits robust generalizability and clinical parity with human experts.
📝 Abstract
Dental panoramic radiographs (DPRs) are widely used in clinical practice for comprehensive oral assessment but present challenges due to overlapping structures and time constraints in interpretation. This study aimed to establish a solid baseline for AI-automated assessment of findings in DPRs by developing and evaluating an AI system and comparing its performance with that of human readers across multinational data sets. We analyzed 6,669 DPRs from three data sets (the Netherlands, Brazil, and Taiwan), focusing on 8 types of dental findings. The AI system combined object detection and semantic segmentation techniques for per-tooth finding identification. Performance metrics included sensitivity, specificity, and area under the receiver operating characteristic curve (AUC-ROC). AI generalizability was tested across data sets, and performance was compared with that of human dental practitioners. The AI system demonstrated comparable or superior performance to human readers, most notably +67.9% (95% CI: 54.0%-81.9%; p<.001) sensitivity for identifying periapical radiolucencies and +4.7% (95% CI: 1.4%-8.0%; p = .008) sensitivity for identifying missing teeth. The AI achieved a macro-averaged AUC-ROC of 96.2% (95% CI: 94.6%-97.8%) across the 8 findings. AI agreement with the reference standard was comparable to inter-human agreement for 7 of 8 findings; the exception was caries (p = .024). The AI system demonstrated robust generalization across diverse imaging and demographic settings and processed images 79 times faster (95% CI: 75-82) than human readers. The AI system effectively assessed findings in DPRs, achieving performance on par with or better than human experts while significantly reducing interpretation time. These results highlight the potential of integrating AI into clinical workflows to improve diagnostic efficiency, accuracy, and patient management.
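The headline metric above, a macro-averaged AUC-ROC over 8 per-tooth findings, is computed by scoring each finding as an independent binary classification task and averaging the resulting AUCs with equal weight. The paper does not publish its evaluation code; the sketch below is a minimal pure-Python illustration of that metric using the Mann-Whitney formulation of AUC, with entirely hypothetical toy labels and scores (not the study's data). In practice one would typically use `sklearn.metrics.roc_auc_score` instead.

```python
from itertools import product

def auc_roc(labels, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive case outscores a randomly chosen negative
    case, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# Hypothetical per-finding labels and model scores (toy data only).
findings = {
    "caries":        ([1, 0, 1, 0, 1, 0], [0.9, 0.2, 0.8, 0.75, 0.7, 0.1]),
    "missing_tooth": ([1, 1, 0, 0, 1, 0], [0.95, 0.85, 0.1, 0.3, 0.6, 0.2]),
}

per_finding = {name: auc_roc(y, s) for name, (y, s) in findings.items()}
# Macro average: each finding contributes equally, regardless of prevalence.
macro_auc = sum(per_finding.values()) / len(per_finding)
```

Macro averaging is the natural choice here because the 8 findings have very different prevalences; a micro average would let common findings such as restorations dominate rarer ones such as periapical radiolucencies.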