Reinforcement Learning Improves LLM Accuracy and Reasoning in Disease Classification from Radiology Reports

📅 2026-04-21

🤖 AI Summary
This work addresses the trade-off between classification accuracy and reasoning quality in disease classification from radiology reports: supervised fine-tuning (SFT) of a lightweight large language model improves accuracy but can degrade the model's reasoning. To reconcile the two, the authors propose a two-stage approach. First, the model is fine-tuned on disease labels. Then Group Relative Policy Optimization (GRPO) refines its predictions by rewarding accuracy and output format, with no reasoning annotations required. This study presents the first application of GRPO to radiology text classification. On three radiologist-annotated datasets, SFT outperformed the baselines, and GRPO further improved classification while also increasing reasoning recall and comprehensiveness.
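
The "group relative" part of GRPO can be illustrated with a minimal sketch: several completions are sampled for the same report, each receives a scalar reward, and each reward is normalized against the mean and spread of its own group. The function name, the use of the population standard deviation, and the `eps` term are assumptions made for illustration, not details taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the group it was sampled with.

    `rewards` holds one scalar per completion sampled for the same prompt.
    Population standard deviation and `eps` are illustrative assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # Completions better than the group average get a positive advantage.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one report: two correct, two incorrect.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

When every completion in a group earns the same reward, all advantages are zero, so such a group contributes no learning signal under this scheme.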

📝 Abstract
Accurate disease classification from radiology reports is essential for many applications. While supervised fine-tuning (SFT) of lightweight LLMs improves accuracy, it can degrade reasoning. We propose a two-stage approach: SFT on disease labels followed by Group Relative Policy Optimization (GRPO) to refine predictions by optimizing accuracy and format without reasoning supervision. Across three radiologist-annotated datasets, SFT outperformed baselines and GRPO further improved classification and enhanced reasoning recall and comprehensiveness.
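
A reward that scores accuracy and format without reasoning supervision might look like the sketch below. The `<think>`/`<answer>` template, the exact-match accuracy criterion, and the `format_weight` value are all hypothetical choices for illustration; the abstract does not specify the authors' actual reward design.

```python
import re

# Hypothetical output template: free-text reasoning followed by a
# comma-separated list of disease labels.
PATTERN = re.compile(
    r"^<think>(.+?)</think>\s*<answer>(.*?)</answer>$", re.DOTALL
)


def parse_labels(text):
    """Split a comma-separated label string into a normalized set."""
    return {part.strip().lower() for part in text.split(",") if part.strip()}


def reward(completion, gold_labels, format_weight=0.5):
    """Score one completion on format and label accuracy only.

    The reasoning text inside <think> is never scored, so no reasoning
    annotations are needed. `format_weight` is an assumed value.
    """
    match = PATTERN.match(completion.strip())
    if match is None:
        return 0.0  # malformed output earns nothing
    predicted = parse_labels(match.group(2))
    gold = {label.lower() for label in gold_labels}
    accuracy = 1.0 if predicted == gold else 0.0
    return format_weight + accuracy
```

For example, a well-formed completion with the correct label set scores `format_weight + 1.0`, a well-formed completion with a wrong label scores only `format_weight`, and output that ignores the template scores zero.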

Problem

Research questions and friction points this paper is trying to address.

disease classification
radiology reports
large language models
reasoning
accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning
Group Relative Policy Optimization
Disease Classification
Radiology Reports
Reasoning Enhancement