Enabling Doctor-Centric Medical AI with LLMs through Workflow-Aligned Tasks and Benchmarks

📅 2025-10-13

🤖 AI Summary

Deploying large language models (LLMs) directly in patient-facing applications poses significant safety and accountability risks in clinical settings. Method: This paper proposes a physician-centered paradigm for medical AI collaboration, in which LLMs are integrated into clinical workflows as assistants to physicians rather than interacting with patients directly. Guided by a two-stage survey of clinical needs, the authors construct DoctorFLAN, a large-scale Chinese medical instruction dataset of 92,000 Q&A instances covering 27 specialties and 22 clinical tasks, and release two physician-oriented evaluation benchmarks: DoctorFLAN-test (550 single-turn Q&A items) and DotaBench (74 multi-turn conversations). Contribution/Results: Experiments with over ten popular LLMs show that fine-tuning on DoctorFLAN notably improves the performance of open-source LLMs in doctor-facing medical scenarios, aligning them with physician workflows and complementing existing patient-oriented models.

📝 Abstract

The rise of large language models (LLMs) has transformed healthcare by offering clinical guidance, yet their direct deployment to patients poses safety risks due to limited domain expertise. To mitigate this, we propose repositioning LLMs as clinical assistants that collaborate with experienced physicians rather than interacting with patients directly. We conduct a two-stage inspiration-feedback survey to identify real-world needs in clinical workflows. Guided by this, we construct DoctorFLAN, a large-scale Chinese medical dataset comprising 92,000 Q&A instances across 22 clinical tasks and 27 specialties. To evaluate model performance in doctor-facing applications, we introduce DoctorFLAN-test (550 single-turn Q&A items) and DotaBench (74 multi-turn conversations). Experimental results with over ten popular LLMs demonstrate that DoctorFLAN notably improves the performance of open-source LLMs in medical contexts, facilitating their alignment with physician workflows and complementing existing patient-oriented models. This work contributes a valuable resource and framework for advancing doctor-centered medical LLM development.

Problem

Research questions and friction points this paper is trying to address.

Repositioning LLMs as clinical assistants for physicians

Identifying real-world clinical workflow needs through surveys

Developing datasets to evaluate doctor-facing medical AI performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Repositioning LLMs as clinical assistants for physicians

Creating the DoctorFLAN dataset with 92,000 medical Q&A instances

Developing specialized benchmarks for doctor-facing AI evaluation
