🤖 AI Summary
This study addresses the high computational cost of full-parameter fine-tuning in medical text summarization by systematically comparing the performance and parameter efficiency of LoRA, Prompt Tuning, and full fine-tuning on the Flan-T5 model series using the PubMed dataset. Across multiple random seeds, LoRA achieves a ROUGE-1 score of 43.52 ± 0.18 on Flan-T5-Large while training only 0.6% of the model's parameters, consistently outperforming full fine-tuning, which yields 40.67 ± 0.21. These findings suggest that the low-rank constraint acts as an implicit regularizer, and they challenge the conventional assumption that updating all model parameters is necessary for optimal performance in this domain.
📝 Abstract
Fine-tuning large language models for domain-specific tasks such as medical text summarization demands substantial computational resources. Parameter-efficient fine-tuning (PEFT) methods offer promising alternatives by updating only a small fraction of parameters. This paper compares three adaptation approaches, namely Low-Rank Adaptation (LoRA), Prompt Tuning, and Full Fine-Tuning, across the Flan-T5 model family on the PubMed medical summarization dataset. Through experiments with multiple random seeds, we demonstrate that LoRA consistently outperforms full fine-tuning, achieving 43.52 ± 0.18 ROUGE-1 on Flan-T5-Large with only 0.6% trainable parameters, compared to 40.67 ± 0.21 for full fine-tuning. Sensitivity analyses examine the impact of LoRA rank and prompt token count. Our findings suggest the low-rank constraint provides beneficial regularization, challenging assumptions about the necessity of full parameter updates. Code is available at https://github.com/eracoding/llm-medical-summarization
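To give a sense of why LoRA trains well under 1% of a model's parameters, the following is a minimal sketch of the standard LoRA parameter count: adapting a weight matrix of shape d_out x d_in with rank r adds two factors, A (r x d_in) and B (d_out x r), for r * (d_in + d_out) trainable parameters. The function names and the numeric values in the example (hidden size, rank, number of adapted matrices, base model size) are illustrative assumptions, not the configuration used in the paper.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter.

    The frozen weight W (d_out x d_in) is augmented with B @ A,
    where A is (rank x d_in) and B is (d_out x rank).
    """
    return rank * (d_in + d_out)


def trainable_fraction(d_in: int, d_out: int, rank: int,
                       n_adapted: int, base_params: int) -> float:
    """Fraction of all parameters that are trainable when only LoRA factors are updated."""
    added = n_adapted * lora_param_count(d_in, d_out, rank)
    return added / (base_params + added)


# Illustrative values only (assumed, not taken from the paper):
# square 1024 x 1024 projections, rank 8, 96 adapted matrices,
# and a base model of 780 million frozen parameters.
fraction = trainable_fraction(d_in=1024, d_out=1024, rank=8,
                              n_adapted=96, base_params=780_000_000)
print(f"trainable fraction: {fraction:.4%}")
```

Because the count grows linearly in the rank, doubling r doubles the number of trainable parameters, which is the quantity varied in a rank sensitivity analysis.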