🤖 AI Summary
Named entity recognition (NER) and health event extraction in the low-resource French biomedical domain suffer from severe annotation scarcity. Method: This paper proposes an LLM-driven collaborative framework that integrates automatic exemplar selection, annotation-guideline summaries injected into prompts, synthetic-data-augmented fine-tuning (leveraging GLiNER and LLaMA-3.1-8B-Instruct), and LLM-based post-validation. Contribution/Results: The approach tightly couples structured domain knowledge (the guidelines), high-quality synthetic data, and multi-stage LLM reasoning to substantially alleviate the annotation bottleneck. Under extreme few-shot settings, GPT-4.1 achieves 61.53% macro-F1 on NER and 15.02% F1 on health event extraction via in-context learning, demonstrating the efficacy of combining careful prompt engineering with post-processing. The framework offers a reusable methodological paradigm for information extraction in low-resource specialized domains.
📝 Abstract
This work presents our participation in the EvalLLM 2025 challenge on biomedical Named Entity Recognition (NER) and health event extraction in French (few-shot setting). For NER, we propose three approaches combining large language models (LLMs), annotation guidelines, synthetic data, and post-processing: (1) in-context learning (ICL) with GPT-4.1, incorporating automatic selection of 10 examples and a summary of the annotation guidelines into the prompt, (2) the universal NER system GLiNER, fine-tuned on a synthetic corpus and then verified by an LLM in post-processing, and (3) the open LLM LLaMA-3.1-8B-Instruct, fine-tuned on the same synthetic corpus. Event extraction uses the same ICL strategy with GPT-4.1, reusing the guideline summary in the prompt. Results show GPT-4.1 leads with a macro-F1 of 61.53% for NER and 15.02% for event extraction, highlighting the importance of well-crafted prompting to maximize performance in very low-resource scenarios.
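The ICL pipeline described above (automatic selection of 10 similar annotated examples plus a guideline summary in the prompt) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names (`select_exemplars`, `build_ner_prompt`) and the token-overlap similarity heuristic are assumptions; the paper does not specify its selection metric, and the assembled prompt would then be sent to GPT-4.1.

```python
def select_exemplars(query, pool, k=10):
    """Rank annotated examples by crude token overlap with the query.

    A stand-in for whatever similarity measure the paper actually uses
    (e.g. embedding cosine similarity); token overlap keeps the sketch
    dependency-free.
    """
    q_tokens = set(query.lower().split())
    scored = sorted(
        pool,
        key=lambda ex: -len(q_tokens & set(ex["text"].lower().split())),
    )
    return scored[:k]


def build_ner_prompt(query, pool, guideline_summary, k=10):
    """Assemble a few-shot NER prompt: guideline summary, k exemplars, query."""
    shots = select_exemplars(query, pool, k)
    lines = ["Annotation guidelines (summary):", guideline_summary, ""]
    for ex in shots:
        lines.append(f"Text: {ex['text']}")
        lines.append(f"Entities: {ex['entities']}")
        lines.append("")
    # The model is asked to complete the final "Entities:" line.
    lines.append(f"Text: {query}")
    lines.append("Entities:")
    return "\n".join(lines)
```

The same prompt skeleton, with event-schema instructions in place of entity labels, would cover the event-extraction track, since the paper reuses the guideline summary there.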