Can LLMs Help Allocate Public Health Resources? A Case Study on Childhood Lead Testing

📅 2025-11-22
🤖 AI Summary
Public health agencies face practical challenges in identifying high-risk communities for childhood lead exposure and allocating resources equitably. This study pioneers the application of large language models (LLMs) to public health resource prioritization, introducing a multidimensional vulnerability scoring framework that integrates the proportion of untested children, elevated blood lead prevalence, and public health service coverage—empirically evaluated across Chicago, New York City, and Washington, D.C. Methodologically, it combines LLM-based agent reasoning, epidemiological data interpretation, inter-regional comparative analysis, and resource allocation simulation. Results reveal that LLMs achieve only moderate average allocation accuracy (0.46; max 0.66) and systematically underprioritize high-risk areas (e.g., West Englewood, Chicago), a failure attributable to retrieval bias, outdated training data, and non-evidence-based narrative interference. The study delineates both the promise and the critical limitations of LLMs in structured, equity-oriented public health decision-making, providing empirical grounding for AI-augmented health justice initiatives.

📝 Abstract
Public health agencies face critical challenges in identifying high-risk neighborhoods for childhood lead exposure with limited resources for outreach and intervention programs. To address this, we develop a Priority Score integrating untested children proportions, elevated blood lead prevalence, and public health coverage patterns to support optimized resource allocation decisions across 136 neighborhoods in Chicago, New York City, and Washington, D.C. We leverage these allocation tasks, which require integrating multiple vulnerability indicators and interpreting empirical evidence, to evaluate whether large language models (LLMs) with agentic reasoning and deep research capabilities can effectively allocate public health resources when presented with structured allocation scenarios. LLMs were tasked with distributing 1,000 test kits within each city based on neighborhood vulnerability indicators. Results reveal significant limitations: LLMs frequently overlooked neighborhoods with highest lead prevalence and largest proportions of untested children, such as West Englewood in Chicago, while allocating disproportionate resources to lower-priority areas like Hunts Point in New York City. Overall accuracy averaged 0.46, reaching a maximum of 0.66 with ChatGPT 5 Deep Research. Despite their marketed deep research capabilities, LLMs struggled with fundamental limitations in information retrieval and evidence-based reasoning, frequently citing outdated data and allowing non-empirical narratives about neighborhood conditions to override quantitative vulnerability indicators.
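The abstract's Priority Score combines three vulnerability indicators per neighborhood. The paper's exact weighting and normalization are not given here, so the following is a minimal sketch assuming an equal-weight average of min-max normalized indicators; the function names and the toy data are illustrative, not the authors' implementation.

```python
# Hypothetical Priority Score sketch: equal-weight average of three
# min-max normalized vulnerability indicators per neighborhood.
# The paper's actual weighting/normalization scheme is not specified here.

def normalize(values):
    """Min-max normalize a list of indicator values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def priority_scores(untested_prop, lead_prevalence, coverage_gap):
    """Combine three indicators into one score per neighborhood.

    untested_prop   -- proportion of children never tested for lead
    lead_prevalence -- prevalence of elevated blood lead levels
    coverage_gap    -- shortfall in public health service coverage
    """
    cols = [normalize(untested_prop),
            normalize(lead_prevalence),
            normalize(coverage_gap)]
    return [sum(vals) / len(cols) for vals in zip(*cols)]

# Toy example with three neighborhoods (made-up numbers)
scores = priority_scores([0.4, 0.1, 0.3],   # untested proportions
                         [0.08, 0.02, 0.05],  # elevated BLL prevalence
                         [0.5, 0.2, 0.4])     # coverage gaps
# The first neighborhood is highest on every indicator, so it scores highest.
```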
Problem

Research questions and friction points this paper is trying to address.

Optimizing public health resource allocation for childhood lead testing
Evaluating LLMs' capability to distribute resources using vulnerability indicators
Addressing limitations in LLM reasoning for evidence-based public health decisions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Priority Score integrates multiple vulnerability indicators for allocation
LLMs evaluated using structured public health resource allocation scenarios
Agentic reasoning tested for evidence-based public health decisions
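The allocation task asks models to distribute 1,000 test kits across neighborhoods according to vulnerability. One plausible quantitative baseline, not stated in the paper, is proportional allocation with largest-remainder rounding so the integer counts sum exactly to the budget; the sketch below assumes that rule and made-up scores.

```python
# Hypothetical baseline: distribute a fixed kit budget in proportion to
# neighborhood priority scores, with largest-remainder rounding so the
# integer allocations sum exactly to the budget. This is an assumed
# comparison rule, not the paper's scoring method.

def allocate_kits(scores, total_kits=1000):
    total = sum(scores)
    raw = [s / total * total_kits for s in scores]
    alloc = [int(r) for r in raw]  # floor of each proportional share
    leftover = total_kits - sum(alloc)
    # Hand leftover kits to the neighborhoods with the largest
    # fractional parts of their proportional shares.
    order = sorted(range(len(raw)),
                   key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc

# Toy example: three neighborhoods with scores 0.9, 0.3, 0.6
alloc = allocate_kits([0.9, 0.3, 0.6])
# The allocations always sum to exactly 1,000.
```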