Baichuan-M1: Pushing the Medical Capability of Large Language Models

📅 2025-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
The medical domain lacks high-quality, domain-specific large language models (LLMs) pretrained from scratch, primarily due to the scarcity of high-fidelity, expert-annotated medical data and the intrinsic complexity of clinical knowledge. To address this gap, we introduce Baichuan-M1-14B, the first open-source, medical-dedicated LLM fully pretrained from scratch, departing from the conventional paradigm of fine-tuning general-purpose foundation models. Leveraging a meticulously curated, multi-source corpus of 20 trillion tokens spanning clinical guidelines, peer-reviewed literature, electronic health records, and biomedical ontologies, Baichuan-M1-14B employs full-scale pretraining with cross-domain capability co-optimization. This design preserves state-of-the-art performance in general competencies, including mathematical reasoning and code generation, while achieving substantial gains in medical question answering, diagnostic inference, and clinical decision support. Empirical evaluation demonstrates consistent superiority over existing medical LLMs across standardized benchmarks.

📝 Abstract
The current generation of large language models (LLMs) is typically designed for broad, general-purpose applications, while domain-specific LLMs, especially in vertical fields like medicine, remain relatively scarce. In particular, the development of highly efficient and practical LLMs for the medical domain is challenging due to the complexity of medical knowledge and the limited availability of high-quality data. To bridge this gap, we introduce Baichuan-M1, a series of large language models specifically optimized for medical applications. Unlike traditional approaches that simply continue pretraining on existing models or apply post-training to a general base model, Baichuan-M1 is trained from scratch with a dedicated focus on enhancing medical capabilities. Our model is trained on 20 trillion tokens and incorporates a range of effective training methods that strike a balance between general capabilities and medical expertise. As a result, Baichuan-M1 not only performs strongly across general domains such as mathematics and coding but also excels in specialized medical fields. We have open-sourced Baichuan-M1-14B, a mini version of our model, which can be accessed through the following links.
Problem

Research questions and friction points this paper addresses.

Enhancing medical capabilities in LLMs
Addressing scarcity of domain-specific medical LLMs
Optimizing LLMs for medical applications
Innovation

Methods, ideas, and system contributions that make the work stand out.

Domain-specific medical LLM
Trained from scratch
20 trillion tokens training
Authors
Bingning Wang
Baichuan Inc.
NLP · Question Answering · Large language model
Haizhou Zhao
Baichuan Inc.
Huozhi Zhou
Baichuan Inc.
Liang Song
Baichuan Inc.
Mingyu Xu
Bytedance
large language model · machine learning
Wei Cheng
Baichuan Inc.
Xiangrong Zeng
Baichuan Inc.
Yupeng Zhang
Baichuan Inc.
Yuqi Huo
Bytedance Inc.
multi-modal pretraining
Zecheng Wang
Baichuan Inc.
Zhengyun Zhao
Tsinghua University
Large Language Model · Information Retrieval · Medical AI
Da Pan
Baichuan Inc.
Fan Yang
Baichuan Inc.
Fei Kou
Baichuan Inc.
Fei Li
Baichuan Inc.
Fuzhong Chen
Baichuan Inc.
Guosheng Dong
Baichuan Inc.
Han Liu
Baichuan Inc.
Hongda Zhang
Baichuan Inc.
Jin He
Baichuan Inc.
Jinjie Yang
Baichuan Inc.
Kangxi Wu
Baichuan Inc.
Kegeng Wu
Baichuan Inc.
Lei Su
Baichuan Inc.
Linlin Niu
Baichuan Inc.
Linzhuang Sun
University of Chinese Academy of Sciences
Multimodal Reasoning
Mang Wang
Baichuan Inc.
Pengcheng Fan
Baichuan Inc.
Qianli Shen
Baichuan Inc.
Rihui Xin
Baichuan Inc.
Shunya Dang
Baichuan Inc.
Songchi Zhou
Baichuan Inc.
Weipeng Chen
Baichuan Inc.
Wenjing Luo
Baichuan Inc.
Xin Chen
Baichuan Inc.
Xin Men
Baichuan Inc.
Xionghai Lin
Baichuan Inc.
Xuezhen Dong
Baichuan Inc.
Yan Zhang
Baichuan Inc.
Yifei Duan
Baichuan Inc.
Yuyan Zhou
Baichuan Inc.
Zhi Ma
China Mobile (Hangzhou) Information Technology Co., Ltd.
Edge Intelligence · Deep Learning · LLM
Zhiying Wu
Baichuan Inc.