Repurposing Annotation Guidelines to Instruct LLM Annotators: A Case Study

📅 2025-10-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge that human-authored annotation guidelines are poorly suited for large language model (LLM)-based text annotation because of their informal, ambiguous, and context-dependent nature. We propose a moderation-oriented guideline refactoring method: automatically transforming natural-language guidelines into structured, semantically precise, instruction-style rules that match how LLMs consume instructions. The approach preserves the original semantic intent while systematically improving executability and robustness. Evaluated on disease entity recognition with the NCBI Disease Corpus, the refactored guidelines significantly improve LLM annotation accuracy and inter-annotator consistency, enabling automated iterative guideline refinement. Empirical analysis further identifies critical failure modes, including instruction ambiguity and insufficient coverage of edge cases. This study establishes a novel paradigm and reusable methodological framework for building high-quality, LLM-native annotation infrastructure.
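The refactoring step summarized above can be pictured as a single prompting pass over the original guidelines. The sketch below is illustrative only: the prompt wording and the `complete` callable (standing in for any chat-completion client) are assumptions, not the authors' implementation.

```python
# Illustrative sketch of guideline refactoring via an LLM; the prompt text
# and the `complete` callable are hypothetical, not the paper's actual code.

REFACTOR_PROMPT = """\
You are revising annotation guidelines for an LLM annotator.
Rewrite the guidelines below as numbered, imperative rules:
- one instruction per rule, with no ambiguity or appeals to annotator judgment;
- make implicit edge cases explicit.

Guidelines:
{guidelines}
"""

def build_refactor_prompt(guidelines: str) -> str:
    """Embed the human-oriented guidelines in the rewrite prompt."""
    return REFACTOR_PROMPT.format(guidelines=guidelines.strip())

def refactor_guidelines(guidelines: str, complete) -> list[str]:
    """Send the prompt through `complete` and keep only the numbered rules."""
    response = complete(build_refactor_prompt(guidelines))
    return [line.strip() for line in response.splitlines()
            if line.strip() and line.strip()[0].isdigit()]
```

With a stubbed `complete` that returns, say, `"1. Tag the longest contiguous disease mention.\n2. Do not tag organism names."`, `refactor_guidelines` yields those two rules as a list, ready to be prepended to an annotation prompt.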

📝 Abstract
This study investigates how existing annotation guidelines can be repurposed to instruct large language model (LLM) annotators for text annotation tasks. Traditional guidelines are written for human annotators who internalize training, while LLMs require explicit, structured instructions. We propose a moderation-oriented guideline repurposing method that transforms guidelines into clear directives for LLMs through an LLM moderation process. Using the NCBI Disease Corpus as a case study, our experiments show that repurposed guidelines can effectively guide LLM annotators, while revealing several practical challenges. The results highlight the potential of this workflow to support scalable and cost-effective refinement of annotation guidelines and automated annotation.
Problem

Research questions and friction points this paper is trying to address.

Repurposing human annotation guidelines for LLM text annotation
Transforming guidelines into explicit instructions for language models
Addressing practical challenges in automated annotation workflow scaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Repurposing annotation guidelines for LLM instructions
Using moderation process to transform human guidelines
Enabling scalable automated annotation with structured directives
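One way to make "inter-annotator consistency" concrete for repeated LLM annotation runs is mean pairwise Jaccard overlap between the entity sets each run produces. This metric choice is our illustration of the idea, not necessarily the measure used in the paper.

```python
from itertools import combinations

def pairwise_agreement(runs: list[set[str]]) -> float:
    """Mean Jaccard overlap between every pair of annotation runs:
    a simple proxy for run-to-run (inter-annotator) consistency."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 1.0  # a single run trivially agrees with itself

    def jaccard(a: set[str], b: set[str]) -> float:
        # Empty-vs-empty counts as full agreement.
        return len(a & b) / len(a | b) if a | b else 1.0

    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

For example, three runs over one NCBI Disease sentence that return `{"colorectal cancer"}`, `{"colorectal cancer", "adenomatous polyposis"}`, and `{"colorectal cancer"}` give a mean agreement of 2/3, flagging the second run's extra span for guideline review.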
Kon Woo Kim
The Graduate University for Advanced Studies, SOKENDAI; National Institute of Informatics, Chiyoda, Tokyo, Japan
Rezarta Islamaj
National Library of Medicine, National Institutes of Health
Natural language processing, text mining, machine learning, data mining
Jin-Dong Kim
Joint Support-Center for Data Science Research, Japan
Florian Boudin
Associate Professor, LS2N - Nantes Université and JFLI - National Institute of Informatics / Tokyo
Natural Language Processing, Information Retrieval, Computational Linguistics
Akiko Aizawa
The Graduate University for Advanced Studies, SOKENDAI; National Institute of Informatics, Chiyoda, Tokyo, Japan