Bridging Literature and the Universe Via A Multi-Agent Large Language Model System

📅 2025-07-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Extracting cosmological simulation parameters is hampered by heterogeneous literature formats and by manual conversion that is error-prone and inefficient. To address this, we propose SimAgents, the first multi-agent large language model system tailored for astrophysics. It integrates domain-specific physical reasoning, tool-augmented execution, and structured inter-agent communication to automate parameter extraction, cross-document consistency verification, and the generation of executable simulation scripts. We introduce a benchmark dataset of over 40 real-world cosmological simulations and publicly release both the system and the dataset. Experiments show that SimAgents significantly outperforms baseline methods in parameter extraction accuracy and in the syntactic and semantic compliance of the generated scripts. By bridging the gap between scientific literature and numerical simulation infrastructure, SimAgents improves research efficiency and reproducibility in computational cosmology.

📝 Abstract

As cosmological simulations and their associated software become increasingly complex, physicists face the challenge of searching through vast amounts of literature and user manuals to extract simulation parameters from dense academic papers, each using different models and formats. Translating these parameters into executable scripts remains a time-consuming and error-prone process. To improve efficiency in physics research and accelerate the cosmological simulation process, we introduce SimAgents, a multi-agent system designed to automate both parameter configuration from the literature and preliminary analysis for cosmology research. SimAgents is powered by specialized LLM agents capable of physics reasoning, simulation software validation, and tool execution. These agents collaborate through structured communication, ensuring that extracted parameters are physically meaningful, internally consistent, and software-compliant. We also construct a cosmological parameter extraction evaluation dataset by collecting over 40 simulations from published papers on arXiv and in leading journals, covering diverse simulation types. Experiments on the dataset demonstrate strong performance by SimAgents, highlighting its effectiveness and its potential to accelerate scientific research for physicists. Our demonstration video is available at: https://youtu.be/w1zLpm_CaWA. The complete system and dataset are publicly available at https://github.com/xwzhang98/SimAgents.

Problem

Research questions and friction points this paper is trying to address.

Automate extraction of simulation parameters from dense literature

Translate parameters into executable scripts efficiently

Ensure parameters are physically meaningful and software-compliant
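
To make the three problems above concrete, here is a minimal Python sketch of what a physics check followed by script generation could look like. It is not the SimAgents implementation: the `CosmoParams` fields, the `check_physics` rules, the tolerance, and the generic `key = value` output are all assumptions chosen for illustration, and the output does not follow the syntax of any particular simulation code.

```python
from dataclasses import dataclass, asdict


@dataclass
class CosmoParams:
    """Hypothetical container for parameters extracted from a paper."""
    omega_m: float            # total matter density parameter
    omega_lambda: float       # dark energy density parameter
    omega_b: float            # baryon density parameter
    h: float                  # dimensionless Hubble parameter
    box_size_mpc_h: float     # comoving box side length in Mpc/h
    n_particles_per_side: int


def check_physics(p: CosmoParams, flat_tol: float = 1e-3) -> list[str]:
    """Return a list of human-readable problems; empty means no issue found.

    The rules are illustrative assumptions (a flat universe with negligible
    radiation), not the checks used by SimAgents.
    """
    issues = []
    if abs(p.omega_m + p.omega_lambda - 1.0) > flat_tol:
        issues.append("omega_m + omega_lambda differs from 1 (flatness assumed)")
    if not 0.0 < p.omega_b < p.omega_m:
        issues.append("omega_b must be positive and smaller than omega_m")
    if not 0.4 < p.h < 1.0:
        issues.append("h is outside the plausible range (0.4, 1.0)")
    if p.box_size_mpc_h <= 0 or p.n_particles_per_side <= 0:
        issues.append("box size and particle count must be positive")
    return issues


def render_config(p: CosmoParams) -> str:
    """Render validated parameters as generic 'key = value' lines."""
    issues = check_physics(p)
    if issues:
        raise ValueError("; ".join(issues))
    return "\n".join(f"{key} = {value}" for key, value in asdict(p).items()) + "\n"


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the benchmark dataset.
    params = CosmoParams(0.3089, 0.6911, 0.0486, 0.6774, 205.0, 2500)
    print(render_config(params))
```

Refusing to render a file when any check fails mirrors the goal stated above: a parameter set that is not physically consistent should never reach the simulation code.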

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent LLM system automates parameter extraction

Agents ensure physics validation and software compliance

Public dataset with 40+ simulations for evaluation
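
This summary does not say how extraction accuracy is scored on the dataset, so the following is only one plausible way to compare extracted parameters against a reference entry. The function names, the relative tolerance, and the example keys are hypothetical.

```python
import math


def param_matches(pred, ref, rel_tol=1e-3):
    """Compare one extracted value with its reference value.

    Numbers match within a relative tolerance; everything else is compared
    as case-insensitive text. Both rules are illustrative assumptions.
    """
    if isinstance(pred, bool) or isinstance(ref, bool):
        return pred == ref
    if isinstance(pred, (int, float)) and isinstance(ref, (int, float)):
        return math.isclose(pred, ref, rel_tol=rel_tol)
    return str(pred).strip().lower() == str(ref).strip().lower()


def extraction_accuracy(predicted, reference, rel_tol=1e-3):
    """Fraction of reference parameters that the prediction reproduces.

    A parameter missing from `predicted` counts as a miss.
    """
    if not reference:
        raise ValueError("reference must contain at least one parameter")
    hits = sum(
        1
        for key, ref_value in reference.items()
        if key in predicted and param_matches(predicted[key], ref_value, rel_tol)
    )
    return hits / len(reference)


if __name__ == "__main__":
    # Made-up entries for illustration; not drawn from the released dataset.
    reference = {"omega_m": 0.3, "h": 0.7, "solver": "TreePM"}
    predicted = {"omega_m": 0.3, "h": 0.68, "solver": "treepm"}
    print(extraction_accuracy(predicted, reference))
```

Scoring per parameter rather than per simulation gives partial credit when most of a configuration is recovered, which is a design choice of this sketch and not a description of the paper's metric.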

🔎 Similar Papers

System for systematic literature review using multiple AI agents: Concept and an empirical evaluation