HLER: Human-in-the-Loop Economic Research via Multi-Agent Pipelines for Empirical Discovery

📅 2026-03-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a multi-agent collaborative framework for AI-driven automation of empirical economic research that explicitly accounts for data constraints and identification strategies, addressing the common pitfall of generating infeasible or spurious hypotheses. The system integrates data auditing, econometric analysis, and paper-writing modules through a dual-loop architecture (a problem-quality screening loop and a research revision loop), augmented by a data-aware hypothesis generation mechanism and human-in-the-loop decision gating that preserves critical expert oversight. Evaluated on three empirical datasets, the framework achieves an 87% feasibility rate for generated research questions, substantially outperforming the unconstrained baseline (41%), while producing complete academic papers at an average API cost of only $0.8–$1.5 per run, improving both the practical viability of automated research and the efficiency of human-AI collaboration.

📝 Abstract
Large language models (LLMs) have enabled agent-based systems that aim to automate scientific research workflows. Most existing approaches focus on fully autonomous discovery, where AI systems generate research ideas, conduct analyses, and produce manuscripts with minimal human involvement. However, empirical research in economics and the social sciences poses additional constraints: research questions must be grounded in available datasets, identification strategies require careful design, and human judgment remains essential for evaluating economic significance. We introduce HLER (Human-in-the-Loop Economic Research), a multi-agent architecture that supports empirical research automation while preserving critical human oversight. The system orchestrates specialized agents for data auditing, data profiling, hypothesis generation, econometric analysis, manuscript drafting, and automated review. A key design principle is dataset-aware hypothesis generation, where candidate research questions are constrained by dataset structure, variable availability, and distributional diagnostics, reducing infeasible or hallucinated hypotheses. HLER further implements a two-loop architecture: a question quality loop that screens and selects feasible hypotheses, and a research revision loop where automated review triggers re-analysis and manuscript revision. Human decision gates are embedded at key stages, allowing researchers to guide the automated pipeline. Experiments on three empirical datasets show that dataset-aware hypothesis generation produces feasible research questions in 87% of cases (versus 41% under unconstrained generation), while complete empirical manuscripts can be produced at an average API cost of $0.8–$1.5 per run. These results suggest that human-AI collaborative pipelines may provide a practical path toward scalable empirical research.
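The two-loop architecture described in the abstract can be sketched in outline. The following Python sketch is illustrative only (all function names, parameters, and the gating interface are hypothetical, not taken from the paper): a question-quality loop keeps proposing candidates until one passes screening and a human gate, and a research-revision loop re-analyzes and redrafts until the automated reviewer is satisfied.

```python
# Illustrative sketch of a dual-loop pipeline with human decision gates
# (all names are hypothetical; the paper's actual interfaces may differ).

def run_pipeline(dataset, propose, screen, analyze, draft, review, human_gate,
                 max_revisions=3):
    # Question-quality loop: propose until a feasible, human-approved question.
    while True:
        question = propose(dataset)          # dataset-aware hypothesis generation
        if screen(question, dataset) and human_gate("question", question):
            break

    # Research-revision loop: reviewer feedback triggers re-analysis and redrafting.
    manuscript = draft(analyze(question, dataset))
    for _ in range(max_revisions):
        feedback = review(manuscript)
        if feedback is None:                 # automated reviewer is satisfied
            break
        manuscript = draft(analyze(question, dataset, feedback))

    human_gate("final", manuscript)          # final expert sign-off
    return manuscript
```

The human gate appears at exactly the two stages the abstract emphasizes: before committing compute to analysis, and before a manuscript leaves the pipeline.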
Problem

Research questions and friction points this paper is trying to address.

empirical research
human-in-the-loop
economics
dataset constraints
hypothesis generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-in-the-loop
Dataset-aware hypothesis generation
Multi-agent pipeline
Empirical research automation
Econometric analysis
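The dataset-aware hypothesis generation listed above can be approximated, at its simplest, as a feasibility filter: a candidate question is retained only if every variable it references actually exists in the dataset, is mostly non-missing, and exhibits variation. A minimal sketch (thresholds and column names are illustrative assumptions, not values from the paper):

```python
def is_feasible(hypothesis_vars, dataset, min_coverage=0.8, min_unique=2):
    """Keep a candidate hypothesis only if all variables it references
    exist in the dataset, are mostly non-missing, and actually vary."""
    for var in hypothesis_vars:
        if var not in dataset:
            return False                       # references a nonexistent column
        values = dataset[var]
        non_missing = [v for v in values if v is not None]
        if len(non_missing) < min_coverage * len(values):
            return False                       # too many missing observations
        if len(set(non_missing)) < min_unique:
            return False                       # no variation to identify an effect
    return True
```

In the full system this screen would also incorporate distributional diagnostics and identification-strategy checks, but even a filter this simple rules out the hallucinated-variable failures that unconstrained generation produces.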