Multi-Agent LLMs Ensemble for Efficient Atrial Fibrillation Annotation of ECG Reports

📅 2024-10-21
🤖 AI Summary
Manual annotation of electronic health record (EHR) data, particularly electrocardiogram (ECG) reports and clinical notes, is time-consuming, costly, and error-prone. To address this, the work proposes a multi-agent ensemble of large language models (LLMs). The method pools diverse open-source LLMs, combines them with natural language processing of EHR text, and aggregates their predictions by majority voting with a minimal winning threshold, which reduces hallucination errors. The approach generalizes across tasks: it supports both atrial fibrillation labeling of ECG reports and identification of social determinants of health (SDOH). On MIMIC-IV, it automatically labeled 623,566 ECG reports with an estimated accuracy of 98.2%, and it achieved competitive performance on SDOH identification from the social history sections of 1,405 clinical notes. Experiments show that the ensemble can match or outperform the best individual LLM, including the best commercial one, offering a scalable and efficient approach to large-scale EHR annotation.

📝 Abstract
This study introduces a novel multi-agent ensemble method powered by LLMs to address a key challenge in machine learning: data labeling, particularly in large-scale EHR datasets. Manual labeling of such datasets requires domain expertise and is labor-intensive, time-consuming, expensive, and error-prone. To overcome this bottleneck, we developed an ensemble LLM method and demonstrated its effectiveness in two real-world tasks: (1) labeling a large-scale unlabeled ECG dataset in MIMIC-IV; and (2) identifying social determinants of health (SDOH) from EHR clinical notes. Trading off benefit against cost, we selected a pool of diverse open-source LLMs with satisfactory performance. We treat each LLM's prediction as a vote and apply majority voting with a minimal winning threshold to form the ensemble. We implemented an ensemble LLM application for EHR data labeling tasks. Using the ensemble together with natural language processing, we labeled the MIMIC-IV ECG dataset of 623,566 ECG reports with an estimated accuracy of 98.2%. We also applied the method to identify SDOH from the social history sections of 1,405 EHR clinical notes, again achieving competitive performance. Our experiments show that the ensemble can outperform individual LLMs, even the best commercial one, and that the method reduces hallucination errors. From this research, we found that (1) the ensemble LLM method significantly reduces the time and effort required to label large-scale EHR data, automating the process with high accuracy and quality; (2) the method generalizes well to other text data labeling tasks, as shown by its application to SDOH identification; (3) an ensemble of diverse LLMs can outperform or match the best individual LLM; and (4) the ensemble method substantially reduces hallucination errors. This approach provides a scalable and efficient solution to data-labeling challenges.
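The voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ensemble_vote`, the `min_votes` parameter, the label strings, and the choice to abstain on ties or on unparseable model outputs are all assumptions made for illustration, since the page does not state the threshold value or the tie-breaking rule.

```python
from collections import Counter
from typing import Optional, Sequence


def ensemble_vote(votes: Sequence[Optional[str]], min_votes: int) -> Optional[str]:
    """Return the winning label, or None when the ensemble abstains.

    Hypothetical helper: each entry in `votes` is one model's predicted
    label, or None if that model's output could not be parsed.
    `min_votes` is the assumed minimal winning threshold.
    """
    # Drop unparseable outputs so they cannot count toward any label.
    valid = [v for v in votes if v is not None]
    if not valid:
        return None

    ranked = Counter(valid).most_common()
    top_label, top_count = ranked[0]

    # Assumption: abstain when two labels tie for first place.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None

    # Assumption: abstain when the leader falls short of the threshold.
    if top_count < min_votes:
        return None

    return top_label


if __name__ == "__main__":
    # Five hypothetical models label one ECG report; one output is unusable.
    print(ensemble_vote(["AF", "AF", "non-AF", "AF", None], min_votes=3))
```

Under these assumptions, a label that only one model produces, such as a hallucinated category, cannot win unless enough other models independently agree with it, which is one way a threshold can limit hallucination errors. Records on which the ensemble abstains could be routed to manual review.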
Problem

Research questions and friction points this paper is trying to address.

Automating large-scale ECG report labeling using ensemble LLMs
Reducing manual effort in EHR data annotation with AI
Improving accuracy and reducing errors in SDOH identification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent LLMs ensemble for efficient ECG annotation
Majority voting with a minimal winning threshold reduces hallucination errors
Open source LLMs achieve high labeling accuracy
Jingwei Huang
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
K. Nezafati
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
Ismael Villanueva-Miranda
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
Zifan Gu
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
A. Navar
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390; Department of Internal Medicine, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
Tingyi Wanyan
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
Qin Zhou
East China University of Science and Technology
computer vision, medical image analysis, federated learning, multi-modal learning
Bo Yao
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
Ruichen Rong
UT Southwestern Medical Center
Deep learning, Biomedical Imaging, NLP
Xiaowei Zhan
Professor of Materials Science, Peking University
Polymer Chemistry, Organic Electronics
Guanghua Xiao
UT Southwestern Medical Center
Artificial intelligence, Machine learning, Medical image analysis, Tissue imaging
Eric D. Peterson
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390; Department of Internal Medicine, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
Donghan M. Yang
Quantitative Biomedical Research Center, Peter O'Donnell School of Public Health, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, USA 75390
Yang Xie
Professor, UT Southwestern Medical Center
Statistical Genomics, Predictive Modeling, Precision Medicine