AstroAgents: A Multi-Agent AI for Hypothesis Generation from Mass Spectrometry Data

📅 2025-03-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Generating scientifically plausible hypotheses about the origins of life from mass spectrometry data of extraterrestrial samples remains challenging due to environmental contamination, the complexity of spectral peaks, and the difficulty of cross-matching peaks with prior studies. Method: We propose a large language model (LLM)-based multi-agent system featuring eight specialized, collaboratively reasoning agents. The system integrates domain-expert role division, dynamic literature retrieval (via Semantic Scholar), mass spectral peak interpretation, and cross-sample matching, establishing a closed-loop workflow: hypothesis generation → deduplication → literature grounding → critical evaluation. Results: Applied to 8 meteorites and 10 soil samples, the system generated 127 hypotheses; expert evaluation judged 36% of them plausible, and 66% of those plausible hypotheses were novel, that is, not previously reported in the literature. This work presents an explainable, literature-anchored framework for automated origin-of-life hypothesis generation, aimed at astrobiology and in situ planetary analysis.
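The two reported percentages are nested, so a short worked calculation may help. The sketch below reads the 66% as a share of the plausible hypotheses (as the abstract states) and derives the share of all evaluated hypotheses that were both plausible and novel. The variable names are illustrative and do not come from the paper.

```python
# Reported shares (from the abstract); variable names are illustrative.
plausible_share = 0.36         # share of evaluated hypotheses judged plausible
novel_given_plausible = 0.66   # share of the plausible hypotheses judged novel

# Share of all evaluated hypotheses that were both plausible and novel.
novel_and_plausible = plausible_share * novel_given_plausible

print(f"{novel_and_plausible:.1%}")  # prints 23.8%
```

Under this reading, roughly one in four evaluated hypotheses was both plausible and novel.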

📝 Abstract
With upcoming sample return missions across the solar system and the increasing availability of mass spectrometry data, there is an urgent need for methods that analyze such data within the context of existing astrobiology literature and generate plausible hypotheses regarding the emergence of life on Earth. Hypothesis generation from mass spectrometry data is challenging due to factors such as environmental contaminants, the complexity of spectral peaks, and difficulties in cross-matching these peaks with prior studies. To address these challenges, we introduce AstroAgents, a large language model-based, multi-agent AI system for hypothesis generation from mass spectrometry data. AstroAgents is structured around eight collaborative agents: a data analyst, a planner, three domain scientists, an accumulator, a literature reviewer, and a critic. The system processes mass spectrometry data alongside user-provided research papers. The data analyst interprets the data, and the planner delegates specific segments to the scientist agents for in-depth exploration. The accumulator then collects and deduplicates the generated hypotheses, and the literature reviewer identifies relevant literature using Semantic Scholar. The critic evaluates the hypotheses, offering rigorous suggestions for improvement. To assess AstroAgents, an astrobiology expert evaluated the novelty and plausibility of more than a hundred hypotheses generated from data obtained from eight meteorites and ten soil samples. Of these hypotheses, 36% were identified as plausible, and among those, 66% were novel. Project website: https://astroagents.github.io/
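The agent workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `run_round`, `Hypothesis`, the prompt wording, and the stub functions are all hypothetical, the LLM and the literature search are passed in as plain callables, and the accumulator is approximated by normalized string matching, whereas the paper describes it as an agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List

LLM = Callable[[str], str]            # hypothetical: prompt in, text out
Search = Callable[[str], List[str]]   # hypothetical: query in, paper titles out


@dataclass
class Hypothesis:
    author: str                       # scientist agent that proposed it
    statement: str
    references: List[str] = field(default_factory=list)
    critique: str = ""


def run_round(data: str, papers: List[str], llm: LLM, search: Search,
              n_scientists: int = 3) -> List[Hypothesis]:
    # Data analyst: interpret the spectra alongside the user-provided papers.
    analysis = llm(f"[data analyst] Interpret the data.\nDATA: {data}\nPAPERS: {papers}")

    # Planner: split the analysis into one segment per scientist agent.
    plan = llm(f"[planner] Give {n_scientists} segments, one per line.\n{analysis}")
    segments = [s.strip() for s in plan.splitlines() if s.strip()][:n_scientists]

    # Scientists: each explores its segment and proposes hypotheses.
    proposed: List[Hypothesis] = []
    for i, segment in enumerate(segments, start=1):
        reply = llm(f"[scientist {i}] Propose hypotheses, one per line, for: {segment}")
        proposed += [Hypothesis(f"scientist {i}", line.strip())
                     for line in reply.splitlines() if line.strip()]

    # Accumulator: collect and deduplicate (assumption: case/whitespace-insensitive match).
    seen, unique = set(), []
    for h in proposed:
        key = " ".join(h.statement.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(h)

    # Literature reviewer, then critic.
    for h in unique:
        h.references = search(h.statement)
        h.critique = llm(f"[critic] Critique: {h.statement}\nRelated work: {h.references}")
    return unique


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real LLM, for demonstration only."""
    if prompt.startswith("[data analyst]"):
        return "analysis of peaks"
    if prompt.startswith("[planner]"):
        return "segment A\nsegment B\nsegment C"
    if prompt.startswith("[scientist"):
        return "Shared idea\nIdea about " + prompt.split("for: ", 1)[1]
    return "needs a contamination control"


def stub_search(query: str) -> List[str]:
    """Stand-in for a Semantic Scholar lookup."""
    return [f"paper related to: {query}"]
```

With the stubs, `run_round("m/z 128.06", ["paper.pdf"], stub_llm, stub_search)` returns four hypotheses: the three scientists each repeat "Shared idea", which the deduplication step collapses to a single entry, plus one segment-specific idea per scientist.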

Problem

Research questions and friction points this paper is trying to address.

Analyze mass spectrometry data for astrobiology hypothesis generation
Overcome challenges like contaminants and spectral peak complexity
Generate novel and plausible hypotheses using multi-agent AI

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent AI system for hypothesis generation
LLM-based agents process spectrometry data collaboratively
A literature reviewer agent retrieves relevant papers via Semantic Scholar to ground the generated hypotheses
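The source does not say how the literature reviewer queries Semantic Scholar. As one possible illustration, a hypothesis could be turned into a request URL for the public Semantic Scholar Graph API paper search endpoint. The helper name `build_search_url` and the choice of fields are assumptions; the sketch only builds the URL and performs no network call.

```python
from urllib.parse import urlencode

# Public Semantic Scholar Graph API paper search endpoint.
BASE_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_url(hypothesis: str, limit: int = 5) -> str:
    """Build a search URL for papers related to a hypothesis (hypothetical helper)."""
    params = {
        "query": hypothesis,               # free-text search query
        "limit": limit,                    # maximum number of results
        "fields": "title,year,abstract",   # assumed selection of returned fields
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_search_url("amino acids in carbonaceous meteorites")
```

The resulting URL can be fetched with any HTTP client; rate limits and API keys are outside the scope of this sketch.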
Daniel Saeedi
Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
Denise Buckner
NASA Goddard Space Flight Center, Greenbelt, MD, USA
Jose C. Aponte
NASA Goddard Space Flight Center, Greenbelt, MD, USA
Amirali Aghazadeh
ECE, Georgia Tech
AI, Machine Learning, Signal Processing, Computational Biology, Molecular Design