🤖 AI Summary
This work addresses two gaps: existing clinical decision support systems suffer from high maintenance costs and poor generalization, while large language models, despite possessing broad medical knowledge, lack adequate diagnostic reasoning and dynamic questioning capabilities. To bridge this gap, the authors propose a structured Clinical Diagnostic Reasoning Data (CDRD) format together with a systematic construction pipeline, and introduce a two-stage training framework: supervised fine-tuning (SFT) to instill foundational diagnostic skills, followed by reinforcement learning (RL) with a custom-designed reward function to refine interactive questioning and reasoning. They also develop a comprehensive evaluation benchmark covering both diagnostic accuracy and inquiry strategy. Experiments show that the resulting model, Dr. Assistant, achieves state-of-the-art performance among open-source models, rivals closed-source systems, and significantly improves clinical interview guidance.
📝 Abstract
Clinical Decision Support Systems (CDSSs) provide reasoning and inquiry guidance for physicians, yet they face notable challenges, including high maintenance costs and limited generalization. Recently, Large Language Models (LLMs) have been widely adopted in healthcare thanks to their extensive knowledge reserves and strong retrieval and communication capabilities. Although LLMs show promise and excel on medical benchmarks, their diagnostic reasoning and inquiry skills remain limited. To mitigate this issue, we propose (1) the Clinical Diagnostic Reasoning Data (CDRD) structure, which captures abstract clinical reasoning logic, along with a pipeline for its construction, and (2) Dr. Assistant, a clinical diagnostic model equipped with clinical reasoning and inquiry skills. Its training follows a two-stage process: supervised fine-tuning (SFT), followed by reinforcement learning (RL) with a tailored reward function. We also introduce a benchmark that evaluates both diagnostic reasoning and inquiry. Our experiments demonstrate that Dr. Assistant outperforms open-source models and achieves performance competitive with closed-source models, providing an effective solution for clinical diagnostic inquiry guidance.
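The abstract does not specify the form of the tailored RL reward. As a toy illustration only (the weights, the `max_questions` cap, and the linear inquiry bonus are assumptions, not the paper's actual design), a scalar reward that trades off diagnostic accuracy against inquiry length might be sketched as:

```python
def clinical_reward(correct_diagnosis: bool,
                    num_questions: int,
                    max_questions: int = 10,
                    w_acc: float = 1.0,
                    w_inq: float = 0.5) -> float:
    """Hypothetical reward for an interactive diagnostic episode.

    Combines a terminal accuracy term (was the final diagnosis correct?)
    with an inquiry-efficiency bonus that decays linearly as the model
    asks more questions. All weights are illustrative assumptions.
    """
    acc = 1.0 if correct_diagnosis else 0.0
    # Bonus is 1.0 for zero questions, 0.0 at or beyond the cap.
    inquiry_bonus = max(0.0, 1.0 - num_questions / max_questions)
    return w_acc * acc + w_inq * inquiry_bonus
```

Under this sketch, a correct diagnosis reached with few questions scores highest, while an incorrect diagnosis after exhausting the question budget scores zero, which is the qualitative behavior one would want a reward for "interactive questioning and reasoning" to encode.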