🤖 AI Summary
Epidemic modeling faces high barriers to entry, long development cycles, and substantial computational and expert-labor costs. Method: We propose a multi-agent large language model (LLM) framework featuring a two-tier agent architecture: a scientist agent (GPT-4.1) orchestrates scientific reasoning and ensures methodological rigor, while task-specialist agents (GPT-4.1 mini) autonomously execute domain-specific subtasks—including literature synthesis, dynamical systems modeling, stochastic simulation, network science analysis, and visualization—to enable end-to-end research paper generation. Contribution/Results: Evaluated across diverse epidemic scenarios, the framework achieves a 100% task success rate, significantly outperforming single-agent baselines in both efficiency and scientific validity. Each full-cycle study consumes an average of 870K tokens (~$1.57). Generated reports receive consistent validation from both AI evaluators and human domain experts, effectively lowering modeling-expertise requirements, accelerating research timelines, and broadening access to advanced computational epidemiology tools.
📝 Abstract
Large Language Models (LLMs) offer new opportunities to automate complex interdisciplinary research domains. Epidemic modeling, characterized by its complexity and reliance on network science, dynamical systems, epidemiology, and stochastic simulations, is a prime candidate for LLM-driven automation. We introduce **EpidemIQs**, a novel multi-agent LLM framework that integrates user inputs and autonomously conducts literature review, analytical derivation, network modeling, mechanistic modeling, stochastic simulation, data visualization and analysis, and, finally, documentation of findings in a structured manuscript. The framework comprises two types of agents: a scientist agent for planning, coordination, reflection, and generation of final results, and task-expert agents, each focusing exclusively on one specific duty and serving as a tool for the scientist agent. The framework consistently generated complete reports in scientific-article format. Specifically, using GPT-4.1 and GPT-4.1 mini as backbone LLMs for the scientist and task-expert agents, respectively, the autonomous process completed with an average total token usage of 870K, at a cost of about $1.57 per study, and achieved a 100% completion success rate across our experiments. We evaluate EpidemIQs on diverse epidemic scenarios, measuring computational cost, completion success rate, and AI and human expert reviews of the generated reports. We also compare EpidemIQs to a single-agent LLM baseline that has the same system prompts and tools and iteratively plans, invokes tools, and revises outputs until task completion; the proposed framework performs consistently better across all five scenarios. EpidemIQs represents a step toward accelerating scientific research by significantly reducing the cost and turnaround time of discovery processes and by enhancing access to advanced modeling tools.
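The two-tier design described above, in which a scientist agent plans a study and invokes task-expert agents as tools, can be sketched in a few lines. This is a hypothetical illustration of the pattern only; the function and expert names below are invented for the example and are not EpidemIQs' actual API, and the stand-in experts return placeholder strings where the real framework would make LLM calls.

```python
# Minimal sketch of a two-tier agent loop: a "scientist" coordinator
# delegates each subtask in its plan to a named task-expert callable.
# All identifiers are illustrative, not taken from EpidemIQs itself.
from dataclasses import dataclass
from typing import Callable

@dataclass
class TaskExpert:
    """A specialist exposed to the scientist agent as a named tool."""
    name: str
    run: Callable[[str], str]

def scientist(plan: list[str], experts: dict[str, TaskExpert]) -> dict[str, str]:
    """Execute the study plan step by step, invoking one expert per
    subtask and collecting their outputs for the final report."""
    results: dict[str, str] = {}
    for step in plan:
        expert = experts[step]
        results[step] = expert.run(f"Perform {step} for the current scenario")
    return results

# Stand-in experts; each lambda captures its name via a default argument.
experts = {
    name: TaskExpert(name, lambda prompt, n=name: f"[{n} output]")
    for name in ("literature_review", "network_modeling",
                 "stochastic_simulation", "visualization")
}

report = scientist(list(experts), experts)
print(report["stochastic_simulation"])  # → [stochastic_simulation output]
```

In this sketch the scientist's "plan" is just an ordered list of expert names; the paper's framework additionally has the scientist agent reflect on and revise intermediate outputs before assembling the manuscript.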