TusoAI: Agentic Optimization for Scientific Methods

📅 2025-09-28
🤖 AI Summary
Scientific discovery is often slowed by the lengthy, manual trial-and-error development of computational tools. To address this, we propose TusoAI, an agentic AI system that takes a scientific task description and an evaluation function and autonomously develops and optimizes task-specific computational methods. Our approach combines large language models with a knowledge tree of domain knowledge, a pool of candidate solutions, and iterative optimization with model diagnosis. In benchmarks on tasks such as single-cell RNA-seq denoising and satellite-based earth monitoring, it outperforms state-of-the-art expert methods, MLE agents, and scientific AI agents. Applied to two open problems in genetics, it uncovered 9 new associations between autoimmune diseases and T cell subtypes and 7 previously unreported links between disease variants and their target genes, illustrating its potential to support hypothesis generation in biomedical research.

📝 Abstract
Scientific discovery is often slowed by the manual development of computational tools needed to analyze complex experimental data. Building such tools is costly and time-consuming because scientists must iteratively review literature, test modeling and scientific assumptions against empirical data, and implement these insights into efficient software. Large language models (LLMs) have demonstrated strong capabilities in synthesizing literature, reasoning with empirical data, and generating domain-specific code, offering new opportunities to accelerate computational method development. Existing LLM-based systems either focus on performing scientific analyses using existing computational methods or on developing computational methods or models for general machine learning without effectively integrating the often unstructured knowledge specific to scientific domains. Here, we introduce TusoAI, an agentic AI system that takes a scientific task description with an evaluation function and autonomously develops and optimizes computational methods for the application. TusoAI integrates domain knowledge into a knowledge tree representation and performs iterative, domain-specific optimization and model diagnosis, improving performance over a pool of candidate solutions. We conducted comprehensive benchmark evaluations demonstrating that TusoAI outperforms state-of-the-art expert methods, MLE agents, and scientific AI agents across diverse tasks, such as single-cell RNA-seq data denoising and satellite-based earth monitoring. Applying TusoAI to two key open problems in genetics improved existing computational methods and uncovered novel biology, including 9 new associations between autoimmune diseases and T cell subtypes and 7 previously unreported links between disease variants and their target genes. Our code is publicly available at https://github.com/Alistair-Turcan/TusoAI.
Problem

Research questions and friction points this paper is trying to address.

Automates computational tool development for scientific data analysis
Integrates domain knowledge into iterative method optimization
Improves performance on genetics and earth monitoring tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic AI autonomously develops scientific computational methods
Integrates domain knowledge into tree representation for optimization
Iteratively optimizes candidate solutions with model diagnosis
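
The general idea of refining a pool of candidate solutions against a user-supplied evaluation function can be illustrated with a small toy sketch. This is not TusoAI's implementation: the names `optimize_pool`, `evaluate`, `propose`, and `moving_average` are hypothetical, the task (choosing a smoothing window for a noisy sine wave) is invented for illustration, and the LLM-driven proposal and diagnosis steps are replaced here by a simple random perturbation.

```python
import math
import random
from typing import Callable, List, Tuple


def optimize_pool(
    initial: List[int],
    evaluate: Callable[[int], float],
    propose: Callable[[int, random.Random], int],
    rounds: int = 20,
    pool_size: int = 3,
    seed: int = 0,
) -> Tuple[int, float]:
    """Keep a small pool of scored candidates and refine it for a fixed number of rounds.

    Assumption: higher scores are better. In an agentic system the `propose`
    step would be model-driven; here it is any callable that maps a parent
    candidate to a new one.
    """
    rng = random.Random(seed)
    # Score the starting candidates and keep the best `pool_size` of them.
    pool = sorted(((evaluate(c), c) for c in initial), reverse=True)[:pool_size]
    for _ in range(rounds):
        _, parent = rng.choice(pool)
        child = propose(parent, rng)
        if any(child == c for _, c in pool):
            continue  # skip candidates that are already in the pool
        pool.append((evaluate(child), child))
        pool = sorted(pool, reverse=True)[:pool_size]
    best_score, best = pool[0]
    return best, best_score


def moving_average(xs: List[float], window: int) -> List[float]:
    """Centered moving average; the effective width is 2 * (window // 2) + 1."""
    half = window // 2
    out = []
    for i in range(len(xs)):
        lo, hi = max(0, i - half), min(len(xs), i + half + 1)
        out.append(sum(xs[lo:hi]) / (hi - lo))
    return out


# Toy "denoising" task: a clean sine wave plus Gaussian noise (illustrative data only).
data_rng = random.Random(1)
clean = [math.sin(i / 10) for i in range(200)]
noisy = [v + data_rng.gauss(0, 0.3) for v in clean]


def evaluate(window: int) -> float:
    """Evaluation function: negative mean squared error against the clean signal."""
    smoothed = moving_average(noisy, window)
    return -sum((s - c) ** 2 for s, c in zip(smoothed, clean)) / len(clean)


def propose(window: int, rng: random.Random) -> int:
    """Stand-in for a model-driven refinement: nudge the window size, keeping it >= 1."""
    return max(1, window + rng.choice([-2, -1, 1, 2]))


best_window, best_score = optimize_pool([1, 3], evaluate, propose)
print(best_window, best_score)
```

Because the pool always retains its highest-scoring member, the returned score can never be worse than the best starting candidate. That is the only guarantee this sketch makes.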
Alistair Turcan
Carnegie Mellon University
Kexin Huang
Stanford University
Lei Li
Carnegie Mellon University
Martin Jinye Zhang
Assistant Professor, Computational Biology Department, Carnegie Mellon University
Statistical genetics, Single-cell RNA-seq, Statistics, Machine learning