A Role-Based LLM Framework for Structured Information Extraction from Healthy Food Policies

📅 2026-04-01
🤖 AI Summary
This study addresses the challenges large language models face when processing structurally diverse and inconsistent healthy food policy texts, which often lead to hallucinations, misclassifications, and omitted information. To mitigate these issues, the authors propose a role-based large language model framework that introduces three expert personas (a policy analyst, a legal strategist, and a food systems specialist) into the policy information extraction pipeline. By integrating explicit domain knowledge with role-specific prompt engineering, the approach aims to improve both the accuracy and the interpretability of complex reasoning. Evaluated on 608 policy documents with the Llama-3.3-70B model, the framework is compared against zero-shot, few-shot, and chain-of-thought baselines and outperforms them on complex reasoning tasks, yielding accurate and transparent information extraction.
📝 Abstract
Current Large Language Model (LLM) approaches for information extraction (IE) in the healthy food policy domain are often hindered by hallucinations, misclassifications, and omissions that result from the structural diversity and inconsistency of policy documents. To address these limitations, this study proposes a role-based LLM framework that automates IE from unstructured policy data by assigning specialized roles: an LLM policy analyst for metadata and mechanism classification, an LLM legal strategy specialist for identifying complex legal approaches, and an LLM food system expert for categorizing food system stages. This framework mimics expert analysis workflows by incorporating structured domain knowledge, including explicit definitions of legal mechanisms and classification criteria, into role-specific prompts. We evaluate the framework using 608 healthy food policies from the Healthy Food Policy Project (HFPP) database, comparing its performance against zero-shot, few-shot, and chain-of-thought (CoT) baselines using Llama-3.3-70B. Our proposed framework demonstrates superior performance in complex reasoning tasks, offering a reliable and transparent methodology for automating IE from health policies.
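The role assignment described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Role` class, the `build_prompt` and `extract` helpers, the prompt wording, and the placeholder domain-knowledge strings are all hypothetical, and the model is passed in as a plain callable so that no particular LLM API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Role:
    name: str              # persona the model is asked to adopt
    task: str              # extraction task assigned to that persona
    domain_knowledge: str  # placeholder text; the paper's actual definitions are not reproduced here


# The three roles named in the abstract; task and knowledge strings are illustrative only.
ROLES = [
    Role("policy analyst",
         "Extract the policy metadata and classify the policy mechanism.",
         "(hypothetical) definitions of policy mechanisms"),
    Role("legal strategy specialist",
         "Identify the legal approaches used by the policy.",
         "(hypothetical) definitions of legal mechanisms"),
    Role("food system expert",
         "Categorize the food system stages the policy addresses.",
         "(hypothetical) classification criteria for food system stages"),
]


def build_prompt(role: Role, policy_text: str) -> str:
    """Assemble a role-specific prompt that embeds the domain knowledge."""
    return (
        f"You are a {role.name}.\n"
        f"Domain knowledge:\n{role.domain_knowledge}\n"
        f"Task: {role.task}\n"
        f"Policy text:\n{policy_text}\n"
        "Answer:"
    )


def extract(policy_text: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Query the model once per role and collect the answers by role name."""
    return {role.name: llm(build_prompt(role, policy_text)) for role in ROLES}
```

Any function that maps a prompt string to a response string can be supplied as `llm`, for example a thin wrapper around a hosted Llama-3.3-70B endpoint. Keeping one prompt per role means each answer can be traced back to the persona and definitions that produced it.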
Problem

Research questions and friction points this paper is trying to address.

information extraction
healthy food policy
hallucination
structural inconsistency
misclassification
Innovation

Methods, ideas, or system contributions that make the work stand out.

role-based LLM
structured information extraction
healthy food policy
domain-specific prompting
expert workflow simulation
Congjing Zhang
Department of Industrial & Systems Engineering, University of Washington, Seattle, WA, USA
Ruoxuan Bao
Department of Management, Shanghai University, Shanghai, China
Jingyu Li
H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA, USA
Yoav Ackerman
Department of Industrial & Systems Engineering, University of Washington, Seattle, WA, USA
Shuai Huang
University of Washington
Statistical Modeling and Analysis, Machine Learning, Healthcare, Manufacturing
Yanfang Su
Assistant Professor, Lingnan University