LLM, Reporting In! Medical Information Extraction Across Prompting, Fine-tuning and Post-correction

📅 2025-10-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Named entity recognition (NER) and health event extraction in the low-resource French biomedical domain suffer from severe annotation scarcity. Method: This paper proposes an LLM-driven collaborative framework integrating automatic exemplar selection, annotation guideline summarization injected into prompts, synthetic data-augmented fine-tuning (leveraging GLiNER and LLaMA-3.1-8B-Instruct), and LLM-based post-validation. Contribution/Results: The approach tightly couples structured domain knowledge (guidelines), high-quality synthetic data, and multi-stage LLM reasoning to substantially alleviate annotation bottlenecks. Under extreme few-shot settings, GPT-4.1 achieves 61.53% macro-F1 on NER and 15.02% F1 on health event extraction via in-context learning, demonstrating the efficacy of synergistic prompt engineering and post-processing. The framework establishes a reusable methodological paradigm for information extraction in low-resource specialized domains.

📝 Abstract
This work presents our participation in the EvalLLM 2025 challenge on biomedical Named Entity Recognition (NER) and health event extraction in French (few-shot setting). For NER, we propose three approaches combining large language models (LLMs), annotation guidelines, synthetic data, and post-processing: (1) in-context learning (ICL) with GPT-4.1, incorporating automatic selection of 10 examples and a summary of the annotation guidelines into the prompt, (2) the universal NER system GLiNER, fine-tuned on a synthetic corpus and then verified by an LLM in post-processing, and (3) the open LLM LLaMA-3.1-8B-Instruct, fine-tuned on the same synthetic corpus. Event extraction uses the same ICL strategy with GPT-4.1, reusing the guideline summary in the prompt. Results show GPT-4.1 leads with a macro-F1 of 61.53% for NER and 15.02% for event extraction, highlighting the importance of well-crafted prompting to maximize performance in very low-resource scenarios.
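The second approach in the abstract pairs a fine-tuned GLiNER model with an LLM verification pass in post-processing. A minimal sketch of that filtering step, assuming a yes/no LLM judgment per predicted span: `ask_llm_verifier` is a hypothetical stand-in for the actual LLM call (the paper does not specify its prompt or decision rule), and the length check inside it is a placeholder, not the real criterion.

```python
# Sketch of LLM-based post-verification of NER predictions (assumption:
# the verifier returns an accept/reject decision per candidate entity).

def ask_llm_verifier(span, label, context):
    # Stub: a real system would prompt an LLM with the guideline summary,
    # the candidate span, its label, and the surrounding context, then
    # parse a yes/no answer. The length test below is purely illustrative.
    return len(span) > 2

def verify_entities(predictions, context):
    """Keep only the (span, label) pairs the verifier accepts."""
    return [
        (span, label)
        for span, label in predictions
        if ask_llm_verifier(span, label, context)
    ]

preds = [("dengue", "MALADIE"), ("et", "SYMPTOME")]
kept = verify_entities(preds, "Suspicion de dengue chez un voyageur.")
# kept -> [("dengue", "MALADIE")], since the stub rejects very short spans
```

Swapping the stub for a real LLM call keeps the surrounding filtering logic unchanged, which makes the verifier easy to iterate on independently of the GLiNER model.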
Problem

Research questions and friction points this paper is trying to address.

Extracting medical entities from French biomedical text using LLMs
Developing few-shot NER methods combining prompting and fine-tuning
Improving health event extraction performance in low-resource settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

In-context learning with GPT-4.1 and guidelines
GLiNER fine-tuned on synthetic corpus with LLM verification
LLaMA-3.1 fine-tuned on synthetic corpus for NER
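The first approach above packs a summary of the annotation guidelines and 10 automatically selected exemplars into a single ICL prompt for GPT-4.1. A minimal sketch of that prompt assembly: the names (`build_ner_prompt`, `GUIDELINE_SUMMARY`), the label set, and the exemplar format are illustrative assumptions, and the automatic selection of the 10 exemplars is elided.

```python
# Sketch of few-shot ICL prompt assembly for French biomedical NER
# (assumed format; the paper's actual prompt wording is not public here).

GUIDELINE_SUMMARY = (
    "Annoter les entités biomédicales dans le texte clinique français. "
    "Labels (hypothetical): MALADIE, SYMPTOME, TRAITEMENT."
)

def build_ner_prompt(exemplars, target_text):
    """Concatenate guideline summary, labeled exemplars, and the target text."""
    parts = [GUIDELINE_SUMMARY]
    for i, (text, entities) in enumerate(exemplars, start=1):
        labeled = "; ".join(f"{span} -> {label}" for span, label in entities)
        parts.append(f"Exemple {i}:\nTexte: {text}\nEntités: {labeled}")
    # The model is expected to complete the final "Entités:" line.
    parts.append(f"Texte: {target_text}\nEntités:")
    return "\n\n".join(parts)

exemplars = [
    ("Le patient présente une fièvre persistante.",
     [("fièvre persistante", "SYMPTOME")]),
]
prompt = build_ner_prompt(exemplars, "Suspicion de dengue chez un voyageur.")
```

The same assembly pattern is reused for event extraction, per the abstract, with the guideline summary shared between both tasks.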
Ikram Belmadani — Aix-Marseille Université, CNRS, LIS UMR 7020, 13000, Marseille, France
Parisa Nazari Hashemi — Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
Thomas Sebbag — Explore, Carquefou, France
Benoit Favre — Professeur CNU 27, LIS UMR 7020, Aix-Marseille University (Natural Language Processing, Spoken Language Understanding, Parsing, Machine Learning)
Guillaume Fortier — Inetum, 93400 Saint-Ouen-sur-Seine, France
Solen Quiniou — Nantes Université - LS2N (Natural language processing, data mining, handwriting recognition, human-computer interaction)
Emmanuel Morin — Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
Richard Dufour — LS2N - TALN/NLP research group - Nantes University (Natural language processing, Biomedical domain, Language modeling, Spontaneous speech)