🤖 AI Summary
To address the loss of critical clinical information in long-document medical summarization caused by context-length limitations, this work adapts LongFormer, a transformer architecture with sparse long-range self-attention, to the healthcare domain, enabling effective modeling of global dependencies and better retention of salient clinical content. Methodologically, the LongFormer architecture is extended through supervised fine-tuning combined with medical-domain adaptation, and evaluated under a dual framework pairing automated ROUGE metrics with expert clinical assessment. Experimental results show statistically significant improvements over RNN-, T5-, and BERT-based baselines on ROUGE-1, ROUGE-2, and ROUGE-L. Clinical expert evaluation further confirms substantial gains in factual completeness and grammatical correctness, though conciseness remains an area for refinement. The study establishes a scalable, high-fidelity approach to abstractive summarization of lengthy clinical texts.
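The sparse attention pattern described above can be sketched as a boolean mask: each token attends within a local sliding window, while a small set of designated tokens attend (and are attended to) globally. The window size, sequence length, and the choice of position 0 as the sole global token below are illustrative assumptions, not details from the paper.

```python
def longformer_attention_mask(seq_len, window, global_positions):
    """Return a boolean matrix where mask[i][j] is True if token i may attend to token j.

    Combines a local sliding window with a few global positions, the
    core idea behind LongFormer-style sparse attention. Illustrative
    sketch only; parameter choices are assumptions.
    """
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            # Local sliding window: each token sees nearby tokens.
            if abs(i - j) <= window // 2:
                mask[i][j] = True
            # Global tokens attend to, and are attended by, every position.
            if i in global_positions or j in global_positions:
                mask[i][j] = True
    return mask

# Toy example: 8 tokens, window of 2, token 0 (e.g. a [CLS]-like token) global.
mask = longformer_attention_mask(seq_len=8, window=2, global_positions={0})
```

Because the number of allowed pairs grows roughly as O(n·w) rather than O(n²), this pattern is what lets the model cover documents far longer than a standard full-attention transformer can.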
📝 Abstract
This paper proposes a medical text summarization method based on LongFormer, aimed at addressing the challenges existing models face when processing long medical texts. Traditional summarization methods are often limited by short effective memory, leading to information loss or reduced summary quality on long texts. By introducing sparse long-range self-attention, LongFormer effectively captures long-distance dependencies in the text, retaining more key information and improving the accuracy and information retention of summaries. Experimental results show that the LongFormer-based model outperforms traditional models such as RNN, T5, and BERT on automatic evaluation metrics like ROUGE. It also receives high scores in expert evaluations, excelling in particular at information retention and grammatical accuracy. However, there is still room for improvement in conciseness and readability: some experts noted that the generated summaries contain redundant information, which hurts conciseness. Future research will focus on further optimizing the model structure to improve conciseness and fluency, enabling more efficient medical text summarization. As medical data continues to grow, automated summarization technology will play an increasingly important role in fields such as medical research, clinical decision support, and knowledge management.
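As a rough illustration of the automatic metric referenced above, ROUGE-1 scores a candidate summary by clipped unigram overlap with a reference. The sketch below is a minimal toy reimplementation, not the official ROUGE scorer (which adds stemming, tokenization rules, and n-gram variants), and the example sentences are invented.

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Toy ROUGE-1: clipped unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection clips each word's count to the smaller of the two.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical clinical-style example (invented, not from the paper's data).
scores = rouge_1("patient shows improved glucose control",
                 "the patient shows improved glucose levels")
```

ROUGE-2 and ROUGE-L follow the same overlap idea over bigrams and longest common subsequences, respectively; expert evaluation complements these surface metrics by judging factual completeness and readability directly.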