🤖 AI Summary
Addressing the challenge of generating accurate, coherent, and semantically complete long-form clinical discharge instructions from sparse admission records, this paper proposes R2AG: a reinforcement learning–based (PPO) reasoning path retriever that synergistically integrates medical knowledge graph–driven interpretable reasoning with large language model–based generation. Its key innovations include a group-relative objective (GRO) reward mechanism enabling cross-node reasoning transitions, and semantic alignment coupled with evidence-focused decoding to enhance generation reliability. Evaluated on the MIMIC-IV-Note dataset, R2AG significantly improves clinical validity (Jaccard similarity +18.7%) and textual quality (BLEU-4 +12.3%), effectively bridging semantic gaps in sparse inputs and reducing risks of clinical misjudgment.
📝 Abstract
Clinical note generation aims to automatically produce free-text summaries of a patient's condition and diagnostic process, with discharge instructions being a representative long-form example. While recent large language model (LLM)-based methods pre-trained on general clinical corpora show promise in clinical text generation, they fall short in producing long-form notes from limited patient information. In this paper, we propose R2AG, the first reinforced retriever for long-form discharge instruction generation based on pre-admission data. R2AG is trained with reinforcement learning to retrieve reasoning paths from a medical knowledge graph, providing explicit semantic guidance to the LLM. To bridge the information gap, we propose Group-Based Retriever Optimization (GRO) which improves retrieval quality with group-relative rewards, encouraging reasoning leaps for deeper inference by the LLM. Comprehensive experiments on the MIMIC-IV-Note dataset show that R2AG outperforms baselines in both clinical efficacy and natural language generation metrics. Further analysis reveals that R2AG fills semantic gaps in sparse input scenarios, and retrieved reasoning paths help LLMs avoid clinical misinterpretation by focusing on key evidence and following coherent reasoning.