🤖 AI Summary
This study addresses the optimization of bed turnover in elective spine surgery wards by predicting the probability of next-day patient discharge from postoperative clinical notes. Thirteen approaches were systematically evaluated, including traditional models such as TF-IDF combined with XGBoost or LightGBM (LGBM) and lightweight large language models (LLMs) such as LoRA-fine-tuned DistilGPT-2 and Bio_ClinicalBERT. Results demonstrate that, under resource-constrained and class-imbalanced clinical conditions, traditional models with strong interpretability and low computational overhead outperform LLMs. Among all methods, TF-IDF + LGBM achieved the best performance (F1 = 0.47, recall = 0.51, AUC-ROC = 0.80), clearly surpassing the fine-tuned LLMs, which, despite recall gains from LoRA adaptation, still underperformed the conventional approaches.
📝 Abstract
Timely discharge prediction is essential for optimizing bed turnover and resource allocation in elective spine surgery units. This study evaluates the feasibility of lightweight, fine-tuned large language models (LLMs) and traditional text-based models for predicting next-day discharge from postoperative clinical notes. We compared 13 models, including TF-IDF with XGBoost and LGBM, and compact LLMs (DistilGPT-2, Bio_ClinicalBERT) fine-tuned via LoRA. TF-IDF with LGBM achieved the best overall balance, with an F1-score of 0.47 for the discharge class, a recall of 0.51, and the highest AUC-ROC (0.80). While LoRA improved recall for DistilGPT-2, transformer-based and generative models underperformed overall. These findings suggest that interpretable, resource-efficient models may outperform compact LLMs in real-world, imbalanced clinical prediction tasks.