Enabling Doctor-Centric Medical AI with LLMs through Workflow-Aligned Tasks and Benchmarks

📅 2025-10-13
🤖 AI Summary
Direct deployment of large language models (LLMs) for patient-facing applications poses significant safety and accountability risks in clinical settings. Method: This paper proposes a physician-centered medical AI collaboration paradigm—integrating AI deeply into clinical workflows rather than replacing human–patient interactions. Grounded in a two-phase clinical needs analysis, we construct DoctorFLAN, a large-scale Chinese medical instruction dataset covering 27 specialties and 22 task types, and release DoctorFLAN-test and DotaBench—the first physician-oriented evaluation benchmarks. Our methodology jointly incorporates clinical workflow-aligned task design, multi-turn dialogue modeling, and single-turn QA assessment. Contribution/Results: Fine-tuning on DoctorFLAN substantially improves the clinical expertise, safety, and workflow compatibility of over ten mainstream open-source LLMs in real-world medical scenarios, effectively addressing inherent limitations of patient-facing models in responsibility boundary definition and clinical contextual understanding.

📝 Abstract
The rise of large language models (LLMs) has transformed healthcare by offering clinical guidance, yet their direct deployment to patients poses safety risks due to limited domain expertise. To mitigate this, we propose repositioning LLMs as clinical assistants that collaborate with experienced physicians rather than interacting with patients directly. We conduct a two-stage inspiration-feedback survey to identify real-world needs in clinical workflows. Guided by this, we construct DoctorFLAN, a large-scale Chinese medical dataset comprising 92,000 Q&A instances across 22 clinical tasks and 27 specialties. To evaluate model performance in doctor-facing applications, we introduce DoctorFLAN-test (550 single-turn Q&A items) and DotaBench (74 multi-turn conversations). Experimental results with over ten popular LLMs demonstrate that DoctorFLAN notably improves the performance of open-source LLMs in medical contexts, facilitating their alignment with physician workflows and complementing existing patient-oriented models. This work contributes a valuable resource and framework for advancing doctor-centered medical LLM development.
Problem

Research questions and friction points this paper is trying to address.

Repositioning LLMs as clinical assistants for physicians
Identifying real-world clinical workflow needs through surveys
Developing datasets to evaluate doctor-facing medical AI performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Repositioning LLMs as clinical assistants for physicians
Creating DoctorFLAN dataset with 92,000 medical Q&A instances
Developing specialized benchmarks for doctor-facing AI evaluation
Wenya Xie
School of Data Science, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, 518172, Guangdong, China.
Qingying Xiao
National Health Data Institute, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, 518172, Guangdong, China.
Yu Zheng
School of Data Science, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, 518172, Guangdong, China.
Xidong Wang
School of Data Science, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, 518172, Guangdong, China.
Junying Chen
School of Data Science, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, 518172, Guangdong, China.
Ke Ji
PhD student, The Chinese University of Hong Kong, Shenzhen
Large Language Models, Agent, Mathematical Reasoning
Anningzhe Gao
Shenzhen Research Institute of Big Data, 2001 Longxiang Boulevard, Longgang District, Shenzhen, 518172, Guangdong, China.
Prayag Tiwari
Associate Professor, Halmstad University, Sweden
Artificial Intelligence, Machine Learning, Deep Learning, Multimodal Interaction, Quantum Computing
Xiang Wan
Shenzhen Research Institute of Big Data
Bioinformatics, Data Mining, Big Data Analysis
Feng Jiang
School of Data Science, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, 518172, Guangdong, China.
Benyou Wang
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
large language models, natural language processing, information retrieval, applied machine learning