Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making

📅 2026-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing large medical language models in open-ended clinical consultations, where they often lack proactive information gathering, systematic reasoning, and robust factual reliability—hindering their utility for clinical-grade decision support. To bridge this gap, we propose Baichuan-M3, a medically augmented large language model that unifies active questioning, multi-turn evidence-integrated reasoning, and adaptive hallucination suppression within a single framework, effectively emulating the structured diagnostic workflow of expert clinicians. Through a specialized training pipeline, Baichuan-M3 achieves state-of-the-art performance on HealthBench, HealthBench-Hallu, and ScanBench, demonstrating significantly superior clinical consultation quality, recommendation reasonableness, and safety compared to GPT-5.2. This advancement marks a pivotal shift in medical AI from passive question-answering toward active, clinically grounded reasoning.

📝 Abstract
We introduce Baichuan-M3, a medical-enhanced large language model engineered to shift the paradigm from passive question-answering to active, clinical-grade decision support. Addressing the limitations of existing systems in open-ended consultations, Baichuan-M3 utilizes a specialized training pipeline to model the systematic workflow of a physician. Key capabilities include: (i) proactive information acquisition to resolve ambiguity; (ii) long-horizon reasoning that unifies scattered evidence into coherent diagnoses; and (iii) adaptive hallucination suppression to ensure factual reliability. Empirical evaluations demonstrate that Baichuan-M3 achieves state-of-the-art results on HealthBench, the newly introduced HealthBench-Hallu, and ScanBench, significantly outperforming GPT-5.2 in clinical inquiry, advisory quality, and safety. The models are publicly available at https://huggingface.co/collections/baichuan-inc/baichuan-m3.
Problem

Research questions and friction points this paper is trying to address.

clinical inquiry
medical decision-making
hallucination suppression
open-ended consultation
large language model
Innovation

Methods, ideas, or system contributions that make the work stand out.

clinical inquiry modeling
active decision support
long-horizon reasoning
adaptive hallucination suppression
medical large language model
Authors

Baichuan-M3 Team
Chengfeng Dou
Fan Yang
Fei Li
Jiyuan Jia
Qiang Ju
Shuai Wang
Tianpeng Li
Xiangrong Zeng
Yijie Zhou (The Chinese University of Hong Kong, Shenzhen; Distributed Optimization, Privacy Preserving)
Hongda Zhang
Jinyang Tai
Linzhuang Sun (University of Chinese Academy of Sciences; Multimodal Reasoning)
Peidong Guo
Yichuan Mo (Ph.D. Candidate, Peking University; Trustworthy AI, Trustworthy LLM, Trustworthy Diffusion Model)
Xiaochuan Wang
Hengfu Cui
Zhishou Zhang