S-RRG-Bench: Structured Radiology Report Generation with Fine-Grained Evaluation Framework

📅 2025-08-04
🤖 AI Summary
Traditional radiology report generation suffers from linguistic redundancy, inconsistency, and fragmentation of clinical information due to reliance on predefined templates or label-based structured methods, often omitting nuanced clinical details. To address these limitations, this work introduces the first end-to-end structured report generation framework for chest X-ray interpretation. We construct MIMIC-STRUC, the first publicly available dataset explicitly modeling four clinically essential elements—disease name, anatomical location, severity level, and probability—in a unified manner. Our method employs a template-free, large language model–driven generation approach, eliminating rigid schema constraints. Furthermore, we propose S-Score, a fine-grained, clinically oriented evaluation metric grounded in radiological reasoning. Experiments demonstrate that our framework significantly outperforms both visual question answering (VQA)–based and template-based baselines in report accuracy, clinical consistency, and interpretability. S-Score achieves strong correlation with human expert assessment (r = 0.92), establishing a standardized paradigm for AI-powered radiology reporting.

📝 Abstract
Radiology report generation (RRG) for diagnostic images, such as chest X-rays, plays a pivotal role in both clinical practice and AI. Traditional free-text reports suffer from redundancy and inconsistent language, which complicates the extraction of critical clinical details. Structured radiology report generation (S-RRG) offers a promising solution by organizing information into standardized, concise formats. However, existing approaches often rely on classification or visual question answering (VQA) pipelines that require predefined label sets and produce only fragmented outputs. Template-based approaches, which generate reports by replacing keywords within fixed sentence patterns, further compromise expressiveness and often omit clinically important details. In this work, we present a novel approach to S-RRG that covers dataset construction, model training, and a new evaluation framework. We first create a robust chest X-ray dataset (MIMIC-STRUC) that records disease names, severity levels, probabilities, and anatomical locations in a clinically relevant, well-structured form. We then train an LLM-based model to generate standardized, high-quality reports. To assess the generated reports, we propose a specialized evaluation metric (S-Score) that measures not only disease prediction accuracy but also the precision of disease-specific details. Because it focuses on elements critical to clinical decision-making, S-Score shows stronger alignment with human assessments. Our approach highlights the effectiveness of structured reports and the importance of a tailored evaluation metric for S-RRG.
Problem

Research questions and friction points this paper is trying to address.

Traditional radiology reports are redundant and inconsistent
Existing structured report methods lack expressiveness and omit details
Current evaluation metrics fail to assess clinically critical elements
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based model for standardized report generation
MIMIC-STRUC dataset with detailed clinical annotations
S-Score metric for clinically meaningful evaluation
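To make the four annotated elements and the idea of detail-aware scoring concrete, the following is a minimal sketch and not the paper's implementation. The `Finding` record, the `toy_structured_score` function, and the example vocabulary ("small", "likely", and so on) are all hypothetical; the actual MIMIC-STRUC schema and the S-Score definition are given in the paper and may differ substantially. The sketch only illustrates scoring disease detection separately from per-disease attribute agreement.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Finding:
    """One structured finding (hypothetical schema, not the MIMIC-STRUC format)."""
    disease: str
    location: Optional[str] = None
    severity: Optional[str] = None
    probability: Optional[str] = None


def toy_structured_score(predicted: List[Finding],
                         reference: List[Finding]) -> Tuple[float, float]:
    """Return (disease_f1, attribute_accuracy).

    This is an illustrative stand-in, NOT the paper's S-Score.
    disease_f1 compares the sets of disease names; attribute_accuracy is the
    fraction of location/severity/probability fields that agree exactly on
    the diseases present in both reports.
    """
    pred = {f.disease.lower(): f for f in predicted}
    ref = {f.disease.lower(): f for f in reference}
    matched = pred.keys() & ref.keys()

    precision = len(matched) / len(pred) if pred else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0

    fields = ("location", "severity", "probability")
    total = len(matched) * len(fields)
    correct = sum(
        getattr(pred[d], name) == getattr(ref[d], name)
        for d in matched
        for name in fields
    )
    attribute_accuracy = correct / total if total else 0.0
    return f1, attribute_accuracy
```

Under these assumptions, a prediction that names the right disease but gets its severity wrong keeps its credit on the disease-level score and loses it only on the attribute-level score, which is the kind of distinction a fine-grained metric is meant to expose.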
Yingshu Li
University of Sydney, New South Wales 2006, Australia
Yunyi Liu
The University of Sydney
LLM, VQA, Visual Grounding, Report Generation, Medical Image
Zhanyu Wang
University of Sydney, New South Wales 2006, Australia
Xinyu Liang
First Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
Lingqiao Liu
Associate Professor at the University of Adelaide
computer vision, machine learning
Lei Wang
University of Wollongong, New South Wales 2522, Australia
Luping Zhou
School of Electrical and Computer Engineering, University of Sydney
Medical Imaging, Computer Vision, Machine Learning