🤖 AI Summary
This study addresses medication event detection and fine-grained classification in electronic health record (EHR) clinical text. We propose a multi-source pre-trained BERT ensemble framework that integrates several BERT variants, each pre-trained on Wikipedia or the MIMIC-III corpus, fine-tunes them on the CMED dataset, and combines their outputs through weighted-voting-based ensemble prediction. Our key contribution lies in combining domain-adaptive pre-training with model-level ensemble learning to mitigate the limited generalization capacity of individual models. Under strict evaluation, our approach achieves absolute improvements of approximately 5.0% in Micro-F1 and 6.2% in Macro-F1 over strong baselines. The method supports automated information extraction for clinical decision support and pharmacovigilance systems.
📝 Abstract
Identifying key variables such as medications, diseases, and relations in health records and clinical notes has a wide range of applications in the clinical domain. The n2c2 2022 challenge provided shared tasks on natural language processing for clinical data analytics on electronic health records (EHR) and released a comprehensively annotated clinical corpus, the Contextualized Medication Event Dataset (CMED). This study focuses on subtask 2 of Track 1 of the challenge, which is to detect and classify medication events in clinical notes, and addresses it by building a novel BERT-based ensemble model. BERT models were first pretrained on different types of large corpora, such as Wikipedia and MIMIC, and then fine-tuned on the CMED training data. Each fine-tuned model classified the medication events in the CMED test data, yielding multiple predictions per event, and these predictions were combined into a final prediction using voting strategies. Experimental results demonstrated that BERT-based ensemble models can improve the strict Micro-F score by about 5% and the strict Macro-F score by about 6%.
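
The voting step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the label strings, the per-model weights, and the tie-breaking rule are all assumptions made for illustration, and the abstract does not specify which voting strategies were used or how they were configured.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Hard voting: each model casts one vote per medication event.

    predictions: list of per-model label lists, all of the same length.
    Ties go to the label that appears first among the models (assumed rule).
    """
    voted = []
    for labels in zip(*predictions):
        # most_common orders equal counts by first occurrence (Python 3.7+)
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def weighted_vote(predictions, weights):
    """Weighted voting: each model's vote counts with its own weight.

    weights: one hypothetical weight per model, e.g. a validation score.
    """
    voted = []
    for labels in zip(*predictions):
        scores = defaultdict(float)
        for label, weight in zip(labels, weights):
            scores[label] += weight
        voted.append(max(scores, key=scores.get))
    return voted


# Illustrative predictions from three fine-tuned models on three events.
# The label names are placeholders, not taken from the abstract.
model_a = ["Disposition", "NoDisposition", "Undetermined"]
model_b = ["Disposition", "Disposition", "NoDisposition"]
model_c = ["NoDisposition", "NoDisposition", "NoDisposition"]

hard = majority_vote([model_a, model_b, model_c])
soft = weighted_vote([model_a, model_b, model_c], weights=[0.6, 0.25, 0.15])
print(hard)
print(soft)
```

In this toy example the two strategies disagree on the third event: two weaker models outvote the first model under majority voting, while the first model's larger weight decides the outcome under weighted voting.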