Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning

📅 2024-10-18
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the dual challenges of scarce labeled data and highly diverse natural-language queries in clinical ECG interpretation, this paper proposes a few-shot multimodal meta-learning framework. Its LLM-agnostic design couples a pre-trained ECG encoder with a frozen large language model (e.g., LLaMA or Gemma) through a trainable fusion module that aligns ECG signals with textual queries, while a meta-learning strategy improves generalization to unseen diagnostic tasks. In a 5-way 5-shot setting with LLaMA-3.1-8B, the framework reaches accuracies of 84.6%, 77.3%, and 69.6% on single verify, choose, and query questions, respectively, and generalizes better to unseen tasks than supervised baselines. It also performs well with a limited number of ECG leads, which makes it a candidate for resource-efficient ECG question answering in low-data clinical settings.
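The 5-way 5-shot setting mentioned above can be illustrated with a small episode sampler. This is a minimal sketch, not the authors' data pipeline: the function name `sample_episode`, the tuple layout `(ecg_id, question, answer_class)`, the `n_query` parameter, and the toy data are all assumptions made for illustration, and it assumes each "way" corresponds to one distinct answer class.

```python
import random
from collections import defaultdict


def sample_episode(examples, n_way=5, k_shot=5, n_query=2, rng=None):
    """Draw one N-way K-shot episode.

    examples: list of (ecg_id, question, answer_class) tuples (hypothetical layout).
    Returns (classes, support, query); support and query never share an example.
    """
    rng = rng or random.Random()

    # Group the examples by their answer class.
    by_class = defaultdict(list)
    for ex in examples:
        by_class[ex[2]].append(ex)

    # Only classes with enough examples for both support and query are usable.
    eligible = sorted(c for c, items in by_class.items()
                      if len(items) >= k_shot + n_query)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with sufficient examples")

    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support.extend(picked[:k_shot])   # K labeled examples per class
        query.extend(picked[k_shot:])     # held-out examples for evaluation
    return classes, support, query


# Toy data: 8 made-up answer classes with 10 examples each.
toy = [(f"ecg_{c}_{i}", "placeholder question", f"class_{c}")
       for c in range(8) for i in range(10)]
classes, support, query = sample_episode(toy, rng=random.Random(0))
print(len(classes), len(support), len(query))
```

With the defaults, each episode holds 5 classes, 25 support examples (5 per class), and 10 query examples. A meta-learner would adapt on the support set and be scored on the query set of each episode.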

📝 Abstract
Electrocardiogram (ECG) interpretation requires specialized expertise, often involving the synthesis of insights from ECG signals with complex clinical queries posed in natural language. The scarcity of labeled ECG data, coupled with the diverse nature of clinical inquiries, presents a significant challenge for developing robust and adaptable ECG diagnostic systems. This work introduces a novel multimodal meta-learning method for few-shot ECG question answering, addressing the challenge of limited labeled data while leveraging the rich knowledge encoded within large language models (LLMs). Our LLM-agnostic approach integrates a pre-trained ECG encoder with a frozen LLM (e.g., LLaMA and Gemma) via a trainable fusion module, enabling the language model to reason about ECG data and generate clinically meaningful answers. Extensive experiments demonstrate superior generalization to unseen diagnostic tasks compared to supervised baselines, achieving notable performance even with limited ECG leads. For instance, in a 5-way 5-shot setting, our method using LLaMA-3.1-8B achieves accuracies of 84.6%, 77.3%, and 69.6% on single verify, choose, and query question types, respectively. These results highlight the potential of our method to enhance clinical ECG interpretation by combining signal processing with the nuanced language understanding capabilities of LLMs, particularly in data-constrained scenarios.
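The abstract does not describe the internals of the trainable fusion module. As a rough illustration only, the sketch below assumes a prefix-style linear projection, a common design in multimodal LLM work and not necessarily the authors' architecture: ECG encoder features are mapped to a few pseudo-token embeddings that are prepended to the text embeddings fed to the frozen LLM. All names (`make_projection`, `project`, `fuse`) and all dimensions are hypothetical, and plain Python lists stand in for tensors.

```python
import random


def make_projection(ecg_dim, llm_dim, n_prefix, rng):
    """Create the only trainable weights in this sketch:
    one (ecg_dim x llm_dim) matrix per prefix token."""
    return [[[rng.gauss(0.0, 0.02) for _ in range(llm_dim)]
             for _ in range(ecg_dim)]
            for _ in range(n_prefix)]


def project(ecg_features, weights):
    """Map one ECG feature vector to n_prefix pseudo-token embeddings."""
    prefix = []
    for w in weights:
        llm_dim = len(w[0])
        token = [sum(ecg_features[i] * w[i][j]
                     for i in range(len(ecg_features)))
                 for j in range(llm_dim)]
        prefix.append(token)
    return prefix


def fuse(ecg_features, text_embeddings, weights):
    """Prepend the projected ECG tokens to the (frozen) text embeddings."""
    return project(ecg_features, weights) + text_embeddings


rng = random.Random(0)
weights = make_projection(ecg_dim=4, llm_dim=8, n_prefix=2, rng=rng)
ecg = [0.1, -0.3, 0.5, 0.2]                 # stand-in for ECG encoder output
text = [[0.0] * 8 for _ in range(3)]        # stand-in for 3 text token embeddings
fused = fuse(ecg, text, weights)
print(len(fused), len(fused[0]))
```

In this arrangement only `weights` would receive gradient updates. The LLM stays frozen, which is what lets the same fusion idea be reused across different language models.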
Problem

Research questions and friction points this paper is trying to address.

Develops ECG-LM for few-shot QA with meta-learning
Addresses limited labeled ECG data for diagnostics
Integrates ECG encoder with LLMs for clinical reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal meta-learning for ECG question answering
LLM-agnostic fusion of ECG encoder with frozen LLM
Few-shot generalization outperforms supervised baselines