Accelerating Social Science Research via Agentic Hypothesization and Experimentation

📅 2026-02-08
🤖 AI Summary
Social science research has long been constrained by a slow “observe–hypothesize–test” cycle, and existing data-driven methods automate only parts of it. This work proposes EXPERIGEN, an agentic framework that closes the loop from hypothesis generation to empirical validation. A Generator agent proposes candidate hypotheses and an Experimenter agent evaluates them in a Bayesian-optimization-inspired two-phase search that combines large language models, statistical testing, and real-world A/B experiments, and that extends to multimodal and relational data. Across multiple domains, the framework discovers 2–4 times more statistically significant hypotheses than prior approaches, and those hypotheses are 7%–17% more predictive. In an expert review of 25 generated hypotheses, 88% were rated moderately or strongly novel and 70% were deemed impactful and worth pursuing. An A/B test of LLM-generated hypotheses yields statistically significant results (p < 1e-6) with a large effect size (344%).

📝 Abstract
Data-driven social science research is inherently slow, relying on iterative cycles of observation, hypothesis generation, and experimental validation. While recent data-driven methods promise to accelerate parts of this process, they largely fail to support end-to-end scientific discovery. To address this gap, we introduce EXPERIGEN, an agentic framework that operationalizes end-to-end discovery through a Bayesian-optimization-inspired two-phase search, in which a Generator proposes candidate hypotheses and an Experimenter evaluates them empirically. Across multiple domains, EXPERIGEN consistently discovers 2-4x more statistically significant hypotheses that are 7-17% more predictive than those of prior approaches, and it naturally extends to complex data regimes, including multimodal and relational datasets. Beyond statistical performance, hypotheses must be novel, empirically grounded, and actionable to drive real scientific progress. To evaluate these qualities, we conduct an expert review of machine-generated hypotheses, collecting feedback from senior faculty. Among 25 reviewed hypotheses, 88% were rated moderately or strongly novel, 70% were deemed impactful and worth pursuing, and most demonstrated rigor comparable to senior graduate-level research. Finally, recognizing that ultimate validation requires real-world evidence, we conduct the first A/B test of LLM-generated hypotheses, observing statistically significant results (p < 1e-6) and a large effect size of 344%.
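
The propose-then-evaluate loop described in the abstract can be illustrated with a toy sketch. This is not the EXPERIGEN implementation: the Generator here is a stub that simply returns untested feature names (the paper uses an LLM-based agent), the Experimenter uses a normal-approximation two-sample test with a Bonferroni correction, and the feature names, dataset, and function names are all hypothetical.

```python
import math
import random
from statistics import mean, variance

# Hypothetical candidate features; a real Generator would propose hypotheses in text.
FEATURES = ["has_emoji", "has_question", "is_long"]

def two_sided_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0
    z = (mean(a) - mean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))

def make_toy_data(n=2000, seed=0):
    """Synthetic rows in which only `has_question` shifts the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {f: rng.random() < 0.5 for f in FEATURES}
        row["outcome"] = rng.gauss(0, 1) + (1.0 if row["has_question"] else 0.0)
        rows.append(row)
    return rows

def generator(history, batch_size=2):
    """Stand-in for the Generator: propose features that have not been tested yet."""
    untested = [f for f in FEATURES if f not in history]
    return untested[:batch_size]

def experimenter(rows, feature):
    """Stand-in for the Experimenter: compare outcomes with and without the feature."""
    with_f = [r["outcome"] for r in rows if r[feature]]
    without = [r["outcome"] for r in rows if not r[feature]]
    return {"effect": mean(with_f) - mean(without), "p": two_sided_p(with_f, without)}

def discover(rows, alpha=0.05, max_rounds=5):
    """Alternate proposal and evaluation, then keep the significant hypotheses."""
    history = {}
    for _ in range(max_rounds):
        candidates = generator(history)
        if not candidates:
            break
        for feature in candidates:
            history[feature] = experimenter(rows, feature)
    # Bonferroni correction over every hypothesis that was tested.
    threshold = alpha / max(len(history), 1)
    accepted = {f: r for f, r in history.items() if r["p"] < threshold}
    return accepted, history

if __name__ == "__main__":
    accepted, history = discover(make_toy_data())
    for feature, result in history.items():
        status = "accepted" if feature in accepted else "rejected"
        print(feature, status, round(result["effect"], 3), result["p"])
```

In this sketch the only feedback passed to the Generator is the set of hypotheses already tested; a search in the spirit of Bayesian optimization would presumably also use the measured effects to decide what to propose next.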
Problem

Research questions and friction points this paper is trying to address.

social science research
hypothesis generation
scientific discovery
data-driven methods
experimental validation
Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic framework
Bayesian optimization
hypothesis generation
end-to-end scientific discovery
LLM-generated hypotheses
Authors
Jishu Sen Gupta
Adobe Media and Data Science Research (MDSR), IIIT-Delhi
Harini SI
Adobe Media and Data Science Research (MDSR), IIIT-Delhi
Somesh Kumar Singh
Adobe Media and Data Science Research (MDSR), IIIT-Delhi
Syed Mohamad Tawseeq
Adobe Media and Data Science Research (MDSR), IIIT-Delhi
Yaman Kumar Singla
Adobe
Machine Learning, Behavioral Science, Computational Marketing, Large Language Models
David Doermann
Professor, University at Buffalo
Artificial Intelligence, Document Analysis, Computer Vision, Media Forensics, Language Technologies
Rajiv Ratn Shah
Adobe Media and Data Science Research (MDSR), IIIT-Delhi
Balaji Krishnamurthy
Adobe Inc
Computer Vision, Machine Learning