BianCang: A Traditional Chinese Medicine Large Language Model

📅 2024-11-17
🏛️ arXiv.org
📈 Citations: 5
Influential: 1
🤖 AI Summary
Existing large language models (LLMs) perform poorly on Traditional Chinese Medicine (TCM) pattern differentiation and treatment (PDT) tasks, primarily because of the deep theoretical divergence between TCM and modern biomedicine and the scarcity of high-quality, structured TCM corpora. To address this, we propose BianCang, the first LLM designed specifically for TCM PDT. Our method introduces a two-stage training paradigm: knowledge injection followed by clinical case-based instruction alignment. We construct ChP-TCM, a dataset derived from the Pharmacopoeia of the People's Republic of China, and curate instruction data from multi-source, real-world hospital clinical records. The training further combines domain-adaptive knowledge injection, standardized TCM terminology modeling, and unified representation of heterogeneous corpora. Evaluated across 11 benchmarks and four core PDT tasks (syndrome identification, prescription recommendation, etiology analysis, and classical citation), BianCang significantly outperforms 29 baseline models. The code, datasets, and model are publicly released.

📝 Abstract
The rise of large language models (LLMs) has driven significant progress in medical applications, including traditional Chinese medicine (TCM). However, current medical LLMs struggle with TCM diagnosis and syndrome differentiation due to substantial differences between TCM and modern medical theory, and the scarcity of specialized, high-quality corpora. This paper addresses these challenges by proposing BianCang, a TCM-specific LLM, using a two-stage training process that first injects domain-specific knowledge and then aligns it through targeted stimulation. To enhance diagnostic and differentiation capabilities, we constructed pre-training corpora, instruction-aligned datasets based on real hospital records, and the ChP-TCM dataset derived from the Pharmacopoeia of the People's Republic of China. We compiled extensive TCM and medical corpora for continuous pre-training and supervised fine-tuning, building a comprehensive dataset to refine the model's understanding of TCM. Evaluations across 11 test sets involving 29 models and 4 tasks demonstrate the effectiveness of BianCang, offering valuable insights for future research. Code, datasets, and models are available at https://github.com/QLU-NLP/BianCang.

Problem

Research questions and friction points this paper is trying to address.

Addresses TCM diagnosis challenges with specialized LLM
Overcomes scarcity of high-quality traditional Chinese medicine corpora
Enhances syndrome differentiation through domain-specific knowledge injection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage training process for TCM LLM
Domain knowledge injection and targeted alignment
Comprehensive TCM datasets from hospital records
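The two-stage paradigm above (knowledge injection via continuous pre-training, then targeted alignment on clinical-case instructions) can be sketched as a toy pipeline. All class and function names here are illustrative assumptions, not the authors' released code:

```python
# Toy sketch of BianCang's two-stage training paradigm (hypothetical names).
# Stage 1 injects domain knowledge via continuous pre-training on TCM corpora;
# stage 2 aligns the model with hospital clinical-case instruction data (SFT).
from dataclasses import dataclass, field


@dataclass
class ToyLLM:
    """Stand-in for a base LLM checkpoint; records which data it saw."""
    name: str
    training_log: list = field(default_factory=list)

    def train(self, stage: str, corpus: list) -> "ToyLLM":
        # A real implementation would run gradient updates here; the toy
        # version only logs the stage name and corpus size.
        self.training_log.append((stage, len(corpus)))
        return self


def two_stage_train(base: ToyLLM, tcm_corpus: list, instruction_data: list) -> ToyLLM:
    # Stage 1: knowledge injection (continuous pre-training on raw TCM text,
    # e.g. the ChP-TCM corpus and other compiled TCM/medical corpora).
    model = base.train("continual_pretraining", tcm_corpus)
    # Stage 2: targeted alignment (supervised fine-tuning on real hospital
    # records for syndrome differentiation and diagnosis tasks).
    model = model.train("instruction_sft", instruction_data)
    return model


model = two_stage_train(
    ToyLLM("base-llm"),
    tcm_corpus=["pharmacopoeia entry ...", "classical TCM text ..."],
    instruction_data=["case record -> syndrome label ..."],
)
print(model.training_log)  # [('continual_pretraining', 2), ('instruction_sft', 1)]
```

The key design point the paper emphasizes is the ordering: domain knowledge is injected first so that the later instruction alignment can "stimulate" capabilities the model already holds, rather than teaching facts and task format at once.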
👥 Authors

Sibo Wei
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
Xueping Peng
University of Technology Sydney
Data Mining · Machine Learning · Healthcare Analytics · NLP
Yi-fei Wang
Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, China
Jiasheng Si
Qilu University of Technology (Shandong Academy of Sciences), China
NLP · Fact Checking · Fake News Detection
Weiyu Zhang
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
Wenpeng Lu
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
Xiaoming Wu
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
Yinglong Wang
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China