LLM, Reporting In! Medical Information Extraction Across Prompting, Fine-tuning and Post-correction

📅 2025-10-03

🤖 AI Summary
Problem: Named entity recognition (NER) and health event extraction in French biomedical text are low-resource tasks with very little annotated data. Method: The paper compares three NER approaches built on LLMs: in-context learning with GPT-4.1, using automatically selected exemplars and a summary of the annotation guidelines injected into the prompt; GLiNER fine-tuned on a synthetic corpus, with its predictions verified by an LLM in post-processing; and LLaMA-3.1-8B-Instruct fine-tuned on the same synthetic corpus. Event extraction reuses the GPT-4.1 in-context learning setup. Contribution/Results: In the few-shot setting, GPT-4.1 with in-context learning scores highest, reaching 61.53% macro-F1 on NER and 15.02% F1 on health event extraction, which points to well-crafted prompting as the main lever when annotated data is very scarce. The combination of guideline summaries, synthetic data, and LLM verification may carry over to information extraction in other low-resource specialized domains.
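
For readers unfamiliar with the metric, the sketch below shows how a macro-F1 score over entity types can be computed: F1 is calculated per label and then averaged with equal weight. It assumes exact span matching on hypothetical `(start, end, label)` triples; the challenge's actual matching rules are not stated here and may differ.

```python
from typing import List, Tuple

# Hypothetical entity representation: (start offset, end offset, label).
Span = Tuple[int, int, str]


def macro_f1(gold: List[Span], pred: List[Span]) -> float:
    """Average the per-label F1 scores, giving each label equal weight.

    Assumption: a prediction counts as correct only if its offsets and
    label match a gold span exactly.
    """
    gold_set, pred_set = set(gold), set(pred)
    labels = {span[2] for span in gold_set | pred_set}
    scores = []
    for label in labels:
        g = {s for s in gold_set if s[2] == label}
        p = {s for s in pred_set if s[2] == label}
        tp = len(g & p)  # true positives for this label
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


gold = [(0, 6, "DISEASE"), (10, 15, "LOCATION")]
pred = [(0, 6, "DISEASE"), (20, 25, "LOCATION")]
score = macro_f1(gold, pred)  # DISEASE F1 = 1.0, LOCATION F1 = 0.0
```

Because every label contributes equally, rare entity types weigh as much as frequent ones, which is why macro-F1 can be much lower than a micro-averaged score on imbalanced data.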

📝 Abstract
This work presents our participation in the EvalLLM 2025 challenge on biomedical Named Entity Recognition (NER) and health event extraction in French (few-shot setting). For NER, we propose three approaches combining large language models (LLMs), annotation guidelines, synthetic data, and post-processing: (1) in-context learning (ICL) with GPT-4.1, incorporating automatic selection of 10 examples and a summary of the annotation guidelines into the prompt, (2) the universal NER system GLiNER, fine-tuned on a synthetic corpus and then verified by an LLM in post-processing, and (3) the open LLM LLaMA-3.1-8B-Instruct, fine-tuned on the same synthetic corpus. Event extraction uses the same ICL strategy with GPT-4.1, reusing the guideline summary in the prompt. Results show GPT-4.1 leads with a macro-F1 of 61.53% for NER and 15.02% for event extraction, highlighting the importance of well-crafted prompting to maximize performance in very low-resource scenarios.
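
The in-context learning setup described above (a guideline summary plus a small set of automatically selected examples in the prompt) could be assembled roughly as follows. This is a minimal sketch, not the authors' implementation: the abstract does not say how examples are selected, so token-overlap (Jaccard) similarity is used here purely as a stand-in, and the function names, prompt layout, and entity labels are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical annotated example: (text, [(mention, label), ...]).
Example = Tuple[str, List[Tuple[str, str]]]


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for the selection method."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_examples(query: str, pool: List[Example], k: int = 10) -> List[Example]:
    """Return the k pool examples most similar to the query text."""
    return sorted(pool, key=lambda ex: jaccard(query, ex[0]), reverse=True)[:k]


def build_prompt(query: str, pool: List[Example], guideline_summary: str, k: int = 10) -> str:
    """Assemble a prompt: guideline summary, then k demonstrations, then the query."""
    parts = ["Annotation guidelines (summary):", guideline_summary, ""]
    for text, entities in select_examples(query, pool, k):
        parts.append(f"Text: {text}")
        parts.append("Entities: " + "; ".join(f"{m} [{lab}]" for m, lab in entities))
        parts.append("")
    parts.append(f"Text: {query}")
    parts.append("Entities:")
    return "\n".join(parts)


pool = [
    ("le patient a la grippe", [("grippe", "DISEASE")]),
    ("cas de rage à Paris", [("rage", "DISEASE"), ("Paris", "LOCATION")]),
    ("rien à signaler", []),
]
prompt = build_prompt("un cas de grippe à Lyon", pool, "Annotate diseases and locations.", k=2)
```

The resulting string would be sent to the model as the user message; an embedding-based retriever could replace `jaccard` without changing the rest of the sketch.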
Problem

Research questions and friction points this paper is trying to address.

Extracting medical entities from French biomedical text using LLMs
Developing few-shot NER methods combining prompting and fine-tuning
Improving health event extraction performance in low-resource settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

In-context learning with GPT-4.1 and guidelines
GLiNER fine-tuned on synthetic corpus with LLM verification
LLaMA-3.1 fine-tuned on synthetic corpus for NER
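
The second item, LLM verification of GLiNER output, amounts to a filter over candidate entities. The sketch below illustrates that pattern only; it assumes the verifier is any callable that returns accept or reject for a candidate in context. The `toy_judge` function is a hypothetical placeholder for an actual LLM call, and the labels are illustrative.

```python
from typing import Callable, List, NamedTuple


class Entity(NamedTuple):
    text: str
    label: str


# A judge takes the sentence and one candidate and returns True to keep it.
Judge = Callable[[str, Entity], bool]


def verify_entities(sentence: str, candidates: List[Entity], judge: Judge) -> List[Entity]:
    """Keep only the candidates that the judge accepts, preserving order."""
    return [entity for entity in candidates if judge(sentence, entity)]


def toy_judge(sentence: str, entity: Entity) -> bool:
    """Placeholder for an LLM verifier (assumption, not the paper's prompt).

    Accepts a candidate only if its mention occurs in the sentence and is
    longer than two characters.
    """
    return entity.text in sentence and len(entity.text) > 2


candidates = [
    Entity("rage", "DISEASE"),
    Entity("de", "DISEASE"),      # spurious short mention
    Entity("Lyon", "LOCATION"),   # not present in the sentence
]
kept = verify_entities("Un cas de rage à Paris", candidates, toy_judge)
```

In practice the judge would wrap a prompt asking the LLM whether the candidate is a valid entity of the given type under the annotation guidelines; a filter of this kind can only remove predictions, so it trades recall for precision.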