🤖 AI Summary
This study addresses the difficulties large language models face when processing structurally diverse and inconsistent healthy food policy texts, which often lead to hallucinations, misclassification, and information omission. To mitigate these issues, the authors propose a role-based large language model framework that, for the first time, introduces multiple expert personas (a policy analyst, a legal strategist, and a food systems specialist) into the policy information extraction pipeline. By integrating explicit domain knowledge into role-specific prompts, the approach improves both the accuracy and interpretability of complex classification tasks. Evaluated on 608 policy documents with Llama-3.3-70B, the framework outperforms zero-shot, few-shot, and chain-of-thought baselines, achieving high-precision, transparent information extraction.
📝 Abstract
Current Large Language Model (LLM) approaches to information extraction (IE) in the healthy food policy domain are often hindered by hallucinations, misclassifications, and omissions arising from the structural diversity and inconsistency of policy documents. To address these limitations, this study proposes a role-based LLM framework that automates IE from unstructured policy data by assigning specialized roles: an LLM policy analyst for metadata and mechanism classification, an LLM legal strategy specialist for identifying complex legal approaches, and an LLM food system expert for categorizing food system stages. The framework mimics expert analysis workflows by incorporating structured domain knowledge, including explicit definitions of legal mechanisms and classification criteria, into role-specific prompts. We evaluate the framework on 608 healthy food policies from the Healthy Food Policy Project (HFPP) database, comparing its performance against zero-shot, few-shot, and chain-of-thought (CoT) baselines using Llama-3.3-70B. The proposed framework demonstrates superior performance on complex reasoning tasks, offering a reliable and transparent methodology for automating IE from health policies.
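The role assignment described in the abstract can be sketched as role-specific prompts routed to a single LLM. The sketch below is illustrative only: the role wording, role names, and `build_role_prompt` helper are assumptions, not the authors' actual prompt text or code.

```python
# Minimal sketch of role-based prompt construction for policy IE.
# The persona wording below is hypothetical, not the paper's prompts.

ROLE_PROMPTS = {
    "policy_analyst": (
        "You are a policy analyst. Extract the policy's metadata "
        "(title, jurisdiction, year) and classify its policy mechanism."
    ),
    "legal_strategist": (
        "You are a legal strategy specialist. Identify the legal "
        "approaches the policy employs (e.g., mandates, incentives)."
    ),
    "food_systems_expert": (
        "You are a food systems expert. Categorize which food system "
        "stages (e.g., production, retail, consumption) the policy targets."
    ),
}

def build_role_prompt(role: str, policy_text: str) -> str:
    """Combine a role persona with the policy document into one prompt."""
    if role not in ROLE_PROMPTS:
        raise ValueError(f"unknown role: {role}")
    return f"{ROLE_PROMPTS[role]}\n\nPolicy document:\n{policy_text}"

# Each role's prompt would then be sent to the same underlying model
# (Llama-3.3-70B in the paper), and the three structured outputs merged.
```

Keeping the personas as separate prompts, rather than one monolithic instruction, is what lets each extraction sub-task carry only the domain definitions it needs.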