🤖 AI Summary
Current vision-language models struggle to achieve accuracy, generalizability, and clinical auditability at the same time in multi-task analysis of dental panoramic radiographs (orthopantomograms, OPGs). This work proposes OPGAgent, a multi-tool agentic system for dental image analysis, an approach the authors describe as previously unexplored in dental imaging. It integrates a hierarchical evidence-gathering module, a specialized toolbox comprising spatial, detection, utility, and expert models, and an anatomy-constrained consensus mechanism. The authors also introduce OPG-Bench, a structured-report protocol built on (Location, Field, Value) triples that enables fine-grained evaluation and hallucination detection. The proposed method outperforms existing dental VLMs and medical agent frameworks on both the OPG-Bench and MMOral-OPG benchmarks, in structured report generation as well as visual question answering, which supports more interpretable and auditable diagnostic output.
📝 Abstract
Orthopantomograms (OPGs) are the standard panoramic radiograph in dentistry, used for full-arch screening across multiple diagnostic tasks. While Vision Language Models (VLMs) now allow multi-task OPG analysis through natural language, they underperform task-specific models on most individual tasks. Agentic systems that orchestrate specialized tools offer a path to both versatility and accuracy, but this approach remains unexplored in dental imaging. To address this gap, we propose OPGAgent, a multi-tool agentic system for auditable OPG interpretation. OPGAgent coordinates specialized perception modules with a consensus mechanism through three components: (1) a Hierarchical Evidence Gathering module that decomposes OPG analysis into global, quadrant, and tooth-level phases and invokes tools dynamically in each phase, (2) a Specialized Toolbox encapsulating spatial, detection, utility, and expert zoos, and (3) a Consensus Subagent that resolves conflicting tool outputs using anatomical constraints. We further propose OPG-Bench, a structured-report protocol based on (Location, Field, Value) triples derived from real clinical reports, which enables a comprehensive audit of both correct findings and hallucinations, beyond what VQA metrics can capture. On our OPG-Bench and the public MMOral-OPG benchmark, OPGAgent outperforms current dental VLMs and medical agent frameworks in both structured-report and VQA evaluation. Code will be released upon acceptance.
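To make the (Location, Field, Value) protocol concrete, the following is a minimal sketch of how a predicted report could be audited against a reference report once both are expressed as triples. It is not the OPG-Bench implementation: the `Finding` type, the `score_report` function, the exact-match comparison, the precision/recall/F1 summary, and the sample tooth numbers and field names are all illustrative assumptions, since the abstract does not specify the matching rules or metrics.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    """One structured finding. All names here are hypothetical."""
    location: str  # e.g. a tooth number such as "36" (notation assumed)
    field: str     # e.g. "caries", "restoration"
    value: str     # e.g. "present", "crown"


def score_report(predicted, reference):
    """Compare two reports as sets of triples using exact matching.

    Assumption: a predicted triple counts as correct only if it matches a
    reference triple exactly; the real protocol may match more leniently.
    """
    pred, ref = set(predicted), set(reference)
    matched = pred & ref
    hallucinated = pred - ref  # reported, but absent from the reference
    missed = ref - pred        # in the reference, but not reported

    precision = len(matched) / len(pred) if pred else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "matched": len(matched),
        "hallucinated": sorted(hallucinated),
        "missed": sorted(missed),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up example data, for illustration only.
reference = [
    Finding("36", "caries", "present"),
    Finding("18", "impaction", "present"),
    Finding("46", "restoration", "crown"),
]
predicted = [
    Finding("36", "caries", "present"),
    Finding("46", "restoration", "crown"),
    Finding("11", "caries", "present"),  # not in the reference
]

result = score_report(predicted, reference)
```

Because each unmatched predicted triple is listed individually, a reviewer can see exactly which location and field a hallucination concerns, which is the kind of audit a single VQA accuracy number does not provide.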