🤖 AI Summary
Existing large language models (LLMs) perform poorly on Traditional Chinese Medicine (TCM) pattern differentiation and treatment (PDT) tasks, largely because TCM theory diverges sharply from modern biomedicine and high-quality, structured TCM corpora are scarce. To address this, the authors propose BianCang, an LLM designed specifically for TCM PDT. The method introduces a two-stage training paradigm: domain knowledge injection followed by instruction alignment on real clinical cases. They construct ChP-TCM, a dataset derived from the Pharmacopoeia of the People's Republic of China, and curate instruction data from multi-source, real-world hospital clinical records. Training further combines domain-adaptive knowledge injection, standardized modeling of TCM terminology, and a unified representation of heterogeneous corpora. Evaluations across 11 test sets and four core PDT tasks (syndrome identification, prescription recommendation, etiology analysis, and classical citation), involving comparisons with 29 models, demonstrate BianCang's effectiveness. The code, datasets, and model are publicly released.
📝 Abstract
The rise of large language models (LLMs) has driven significant progress in medical applications, including traditional Chinese medicine (TCM). However, current medical LLMs struggle with TCM diagnosis and syndrome differentiation due to substantial differences between TCM and modern medical theory, as well as the scarcity of specialized, high-quality corpora. This paper addresses these challenges by proposing BianCang, a TCM-specific LLM, trained with a two-stage process that first injects domain-specific knowledge and then aligns it through targeted stimulation. To enhance diagnostic and differentiation capabilities, we constructed pre-training corpora, instruction-aligned datasets based on real hospital records, and the ChP-TCM dataset derived from the Pharmacopoeia of the People's Republic of China. We compiled extensive TCM and medical corpora for continuous pre-training and supervised fine-tuning, building a comprehensive dataset to refine the model's understanding of TCM. Evaluations across 11 test sets involving 29 models and four tasks demonstrate the effectiveness of BianCang, offering valuable insights for future research. Code, datasets, and models are available at https://github.com/QLU-NLP/BianCang.