Adaptive Budget Allocation in LLM-Augmented Surveys

📅 2026-04-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem that the reliability of LLM-generated survey responses varies across questions and is unknown in advance. The authors propose an adaptive budget allocation algorithm that needs no prior knowledge of question-level LLM accuracy: it collects human responses while learning online how hard each question is for the LLM, and shifts the limited labeling budget toward the questions where the LLM is least reliable. Each human label serves two purposes, improving the estimate for its question and revealing how well the LLM predicts human responses there. The authors prove that the gap to the best possible allocation vanishes as the budget grows. On real survey data, uniform allocation wastes 10–12% of the budget relative to the optimal allocation, whereas the adaptive algorithm wastes 2–6%, so it reaches the same estimation accuracy with fewer human-labeled samples. The advantage grows as LLM performance becomes more heterogeneous across questions.
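To make the notion of budget "waste" concrete, here is a minimal sketch under assumptions that are not taken from the paper: each question k has a residual standard deviation sigma_k (how far LLM predictions sit from human responses), the estimation error on question k is sigma_k^2 / n_k after n_k human labels, and the objective is the sum of these errors. Under that assumed model the best fixed allocation gives each question a share of labels proportional to sigma_k, and the function below reports the fraction of a uniform budget that such an informed allocation would not need. The names `uniform_waste` and `sigmas` are illustrative.

```python
def uniform_waste(sigmas):
    """Fraction of a uniform budget that an informed allocation would save.

    Assumed model (not from the paper): total error = sum(sigma_k**2 / n_k).
    Uniform allocation with budget B gives error K * sum(sigma_k**2) / B.
    Allocating n_k proportional to sigma_k gives error sum(sigma_k)**2 / B.
    The ratio of the two is the share of B the informed allocation needs
    to match the uniform error; one minus that ratio is the waste.
    """
    k = len(sigmas)
    total = sum(sigmas)
    total_sq = sum(s * s for s in sigmas)
    if k == 0 or total_sq == 0:
        return 0.0  # nothing to allocate, or the LLM is perfect everywhere
    return 1.0 - (total * total) / (k * total_sq)
```

With identical questions, for example `uniform_waste([2.0, 2.0, 2.0])`, the waste is zero; with `uniform_waste([1.0, 3.0])` it is roughly 0.2. This matches the qualitative claim above that the gain from adapting grows with heterogeneity, although the paper's own definition of waste may differ.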

📝 Abstract

Large language models (LLMs) can generate survey responses at low cost, but their reliability varies substantially across questions and is unknown before data collection. Deploying LLMs in surveys still requires costly human responses for verification and correction. How should a limited human-labeling budget be allocated across questions in real time? We propose an adaptive allocation algorithm that learns which questions are hardest for the LLM while simultaneously collecting human responses. Each human label serves a dual role: it improves the estimate for that question and reveals how well the LLM predicts human responses on it. The algorithm directs more budget to questions where the LLM is least reliable, without requiring any prior knowledge of question-level LLM accuracy. We prove that the allocation gap relative to the best possible allocation vanishes as the budget grows, and validate the approach on both synthetic data and a real survey dataset with 68 questions and over 2000 respondents. On real survey data, the standard practice of allocating human labels uniformly across questions wastes 10–12% of the budget relative to the optimal allocation; our algorithm reduces this waste to 2–6%, and the advantage grows as questions become more heterogeneous in LLM prediction quality. The algorithm achieves the same estimation quality as traditional uniform sampling with fewer human samples, requires no pilot study, and is backed by formal performance guarantees. More broadly, the framework applies whenever scarce human oversight must be allocated across tasks where LLM reliability is unknown.
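The dual role of each human label can be illustrated with a small simulation. The sketch below is not the paper's algorithm; it is a plug-in greedy rule written under stated assumptions: the quantity learned per question is the standard deviation of the residual (human response minus LLM prediction), and the target is a label share proportional to that standard deviation. The function `adaptive_allocate`, the callback `sample_residual`, and the `warmup` parameter are all hypothetical names introduced for illustration.

```python
import random
import statistics

def adaptive_allocate(sample_residual, num_questions, budget, warmup=5):
    """Spend `budget` human labels across questions, one label at a time.

    sample_residual(k) is a hypothetical callback: it collects one human
    response for question k and returns (human response - LLM prediction).
    Returns the number of labels spent on each question.
    """
    if warmup < 2:
        raise ValueError("warmup must be at least 2 to estimate a spread")
    if budget < warmup * num_questions:
        raise ValueError("budget too small for the warm-up phase")
    residuals = [[] for _ in range(num_questions)]
    # Warm-up: a few labels per question so every spread estimate exists.
    for k in range(num_questions):
        for _ in range(warmup):
            residuals[k].append(sample_residual(k))
    # Greedy phase: each new label goes to the question whose estimated
    # residual spread is largest relative to the labels it already has.
    # Equalising sd_k / n_k across questions drives n_k toward a share
    # proportional to sd_k (an assumed target, not the paper's rule).
    for _ in range(budget - warmup * num_questions):
        sds = [statistics.stdev(r) for r in residuals]
        k = max(range(num_questions), key=lambda j: sds[j] / len(residuals[j]))
        # The same label both refines the estimate and updates sds next round.
        residuals[k].append(sample_residual(k))
    return [len(r) for r in residuals]

# Synthetic demo: the LLM is accurate on two questions and poor on the third.
rng = random.Random(0)
true_sd = [0.2, 0.2, 1.0]
counts = adaptive_allocate(lambda k: rng.gauss(0.0, true_sd[k]), 3, 600)
print(counts)
```

In this synthetic setup the third question, where the simulated LLM is least reliable, ends up with most of the labels, and no pilot study is needed because the spread estimates are refined by the same labels used for estimation.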

Problem

Research questions and friction points this paper is trying to address.

adaptive budget allocation

LLM-augmented surveys

human labeling

reliability uncertainty

survey data collection

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive budget allocation

LLM-augmented surveys

human-in-the-loop