🤖 AI Summary
To address the limited interpretability of deep learning models and their low clinical adoption in radiology report generation, this paper proposes a synergistic framework integrating Concept Bottleneck Models (CBMs) with multi-agent Retrieval-Augmented Generation (RAG). Methodologically: (1) it establishes a two-stage interpretable paradigm mediated by clinically meaningful concepts, explicitly modeling the mapping from visual features → medical concepts → diagnostic decisions; (2) it designs a multi-agent RAG system that ensures traceability and content provenance throughout report generation. The key contribution is the first deep integration of CBMs with multi-agent RAG, enabling both visualization of the classification path and end-to-end explanation of report generation. Evaluated on the COVID-QU dataset, the framework achieves 81% classification accuracy and attains 84–90% across five key structured-report metrics. LLM-as-a-judge evaluation further confirms its superior clinical relevance and explanatory capability compared to baseline methods.
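The two-stage mapping described above (visual features → medical concepts → diagnostic decision) can be sketched as a minimal concept bottleneck: the classifier sees only the concept vector, never the raw image features, so each prediction can be traced back to human-readable concept activations. All dimensions, layer shapes, and the linear-layer design below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: CXR encoder output -> clinical concepts -> diagnosis classes
N_FEATURES, N_CONCEPTS, N_CLASSES = 512, 8, 3  # e.g. COVID-19 / non-COVID pneumonia / normal

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Stage 1: visual features -> concept scores (independent sigmoids, one per concept)
W_concept = rng.normal(scale=0.01, size=(N_CONCEPTS, N_FEATURES))
# Stage 2: concept scores -> class logits; the classifier's ONLY input is the bottleneck
W_class = rng.normal(scale=0.01, size=(N_CLASSES, N_CONCEPTS))

features = rng.normal(size=N_FEATURES)    # stand-in for a CXR image encoder's output
concepts = sigmoid(W_concept @ features)  # interpretable concept vector (the bottleneck)
probs = softmax(W_class @ concepts)       # diagnosis depends only on the concepts

print("concept activations:", concepts.round(2))
print("class probabilities:", probs.round(2))
```

In the paper's framework, this concept vector is what guides the multi-agent RAG system: because report generation is conditioned on explicit concepts rather than opaque features, each generated statement can be linked to a concept activation and to retrieved supporting content.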
📝 Abstract
Deep learning has advanced medical image classification, but interpretability challenges hinder its clinical adoption. This study enhances interpretability in Chest X-ray (CXR) classification by using concept bottleneck models (CBMs) and a multi-agent Retrieval-Augmented Generation (RAG) system for report generation. By modeling relationships between visual features and clinical concepts, we create interpretable concept vectors that guide a multi-agent RAG system to generate radiology reports, enhancing clinical relevance, explainability, and transparency. Evaluation of the generated reports using an LLM-as-a-judge confirmed the interpretability and clinical utility of our model's outputs. On the COVID-QU dataset, our model achieved 81% classification accuracy and demonstrated robust report generation performance, with five key metrics ranging between 84% and 90%. This interpretable multi-agent framework bridges the gap between high-performance AI and the explainability required for reliable AI-driven CXR analysis in clinical settings. Our code is available at https://github.com/tifat58/IRR-with-CBM-RAG.git.