Comparative Analysis of Abstractive Summarization Models for Clinical Radiology Reports

📅 2025-06-19

📈 Citations: 0

✨ Influential: 0

career value

178K/year

🤖 AI Summary

This study addresses the abstract summarization task from “Imaging Findings” to “Impression” in radiology reports. Leveraging the MIMIC-CXR dataset, it presents the first systematic evaluation of open-source large language models (LLaMA-3-8B, ChatGPT-4), classical sequence-to-sequence models (T5, BART, PEGASUS), and an enhanced Pointer Generator network with coverage mechanism for clinical text summarization. Performance is assessed using multi-dimensional metrics—ROUGE, METEOR, and BERTScore—to quantify semantic fidelity and coverage of critical diagnostic entities. Results show that LLaMA-3-8B and the improved Pointer Generator achieve superior clinical entity recall and logical coherence; all models improve ROUGE-L by 12.4–28.7% over baseline. This work establishes the first unified benchmark for radiology report summarization, empirically validating the adaptability of large language models to domain-specific summarization and identifying key optimization directions.

Technology Category

Application Category

📝 Abstract

The findings section of a radiology report is often detailed and lengthy, whereas the impression section is comparatively more compact and captures key diagnostic conclusions. This research explores the use of advanced abstractive summarization models to generate the concise impression from the findings section of a radiology report. We have used the publicly available MIMIC-CXR dataset. A comparative analysis is conducted on leading pre-trained and open-source large language models, including T5-base, BART-base, PEGASUS-x-base, ChatGPT-4, LLaMA-3-8B, and a custom Pointer Generator Network with a coverage mechanism. To ensure a thorough assessment, multiple evaluation metrics are employed, including ROUGE-1, ROUGE-2, ROUGE-L, METEOR, and BERTScore. By analyzing the performance of these models, this study identifies their respective strengths and limitations in the summarization of medical text. The findings of this paper provide helpful information for medical professionals who need automated summarization solutions in the healthcare sector.

Problem

Research questions and friction points this paper is trying to address.

Generating concise radiology impressions from detailed findings

Comparing abstractive summarization models for medical texts

Evaluating model performance using multiple metrics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses abstractive summarization for radiology reports

Compares multiple pre-trained large language models

Employs diverse metrics for thorough evaluation

🔎 Similar Papers

No similar papers found.