TusoAI: Agentic Optimization for Scientific Methods

📅 2025-09-28
🤖 AI Summary
Scientific discovery is often slowed by the manual, trial-and-error development of computational tools. To address this, the authors propose TusoAI, an agentic AI system that takes a scientific task description and an evaluation function and then generates and optimizes task-specific computational methods end to end. The approach combines large language models with a knowledge tree that organizes domain knowledge, a pool of candidate solutions, and an iterative loop of evaluation, model diagnosis, and refinement. In benchmarks that include single-cell RNA-seq denoising and satellite-based earth monitoring, it outperforms state-of-the-art expert methods, MLE agents, and scientific AI agents. Applied to two open problems in genetics, it improved existing computational methods and uncovered 9 new associations between autoimmune diseases and T cell subtypes, as well as 7 previously unreported links between disease variants and their target genes.

📝 Abstract
Scientific discovery is often slowed by the manual development of computational tools needed to analyze complex experimental data. Building such tools is costly and time-consuming because scientists must iteratively review literature, test modeling and scientific assumptions against empirical data, and implement these insights into efficient software. Large language models (LLMs) have demonstrated strong capabilities in synthesizing literature, reasoning with empirical data, and generating domain-specific code, offering new opportunities to accelerate computational method development. Existing LLM-based systems focus either on performing scientific analyses using existing computational methods or on developing computational methods or models for general machine learning, without effectively integrating the often unstructured knowledge specific to scientific domains. Here, we introduce TusoAI, an agentic AI system that takes a scientific task description with an evaluation function and autonomously develops and optimizes computational methods for the application. TusoAI integrates domain knowledge into a knowledge tree representation and performs iterative, domain-specific optimization and model diagnosis, improving performance over a pool of candidate solutions. We conducted comprehensive benchmark evaluations demonstrating that TusoAI outperforms state-of-the-art expert methods, MLE agents, and scientific AI agents across diverse tasks, such as single-cell RNA-seq data denoising and satellite-based earth monitoring. Applying TusoAI to two key open problems in genetics improved existing computational methods and uncovered novel biology, including 9 new associations between autoimmune diseases and T cell subtypes and 7 previously unreported links between disease variants and their target genes. Our code is publicly available at https://github.com/Alistair-Turcan/TusoAI.
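To make the input described in the abstract concrete, here is a minimal sketch of what "a scientific task description with an evaluation function" could look like for a toy denoising problem. The names `Task`, `make_denoising_task`, `identity`, and `moving_average` are hypothetical and chosen for illustration only; they are not taken from the TusoAI code base, and the negative mean squared error score is an assumed stand-in for whatever metric a real task would define.

```python
from dataclasses import dataclass
from typing import Callable, List

# A candidate method maps an input signal to an output signal (assumption).
Method = Callable[[List[float]], List[float]]


@dataclass(frozen=True)
class Task:
    """Hypothetical task specification: free-text description plus a scorer."""
    description: str
    evaluate: Callable[[Method], float]  # higher score means a better method


def make_denoising_task(noisy: List[float], clean: List[float]) -> Task:
    """Build a toy denoising task scored by negative mean squared error."""
    def evaluate(method: Method) -> float:
        denoised = method(noisy)
        mse = sum((d - c) ** 2 for d, c in zip(denoised, clean)) / len(clean)
        return -mse

    return Task("Denoise a 1-D signal given noisy observations", evaluate)


def identity(signal: List[float]) -> List[float]:
    """Baseline method: return the input unchanged."""
    return list(signal)


def moving_average(signal: List[float]) -> List[float]:
    """Smooth each point with its immediate neighbours (window of up to 3)."""
    smoothed = []
    for i in range(len(signal)):
        window = signal[max(0, i - 1): i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

Any callable with the `Method` signature can be passed to `task.evaluate`, so an agent could compare candidate methods against a single score without knowing how each one works internally.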
Problem

Research questions and friction points this paper is trying to address.

Automates computational tool development for scientific data analysis
Integrates domain knowledge into iterative method optimization
Improves performance on genetics and earth monitoring tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic AI autonomously develops scientific computational methods
Integrates domain knowledge into tree representation for optimization
Iteratively optimizes candidate solutions with model diagnosis
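The three points above can be sketched as a small search loop. This is an illustrative toy under stated assumptions, not the authors' implementation: `KnowledgeNode`, `optimize`, the pool size, and the random selection strategy are all hypothetical, a candidate is reduced to a dictionary of settings, refinements are plain functions rather than LLM-generated code, and the model-diagnosis step is omitted.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Candidate = Dict[str, int]  # assumption: a method is summarized by its settings
Refinement = Callable[[Candidate], Candidate]


@dataclass
class KnowledgeNode:
    """Hypothetical knowledge-tree node: a topic, optionally with a refinement."""
    topic: str
    refine: Optional[Refinement] = None
    children: List["KnowledgeNode"] = field(default_factory=list)

    def leaves(self) -> List["KnowledgeNode"]:
        """Return all leaf nodes below (or equal to) this node."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def optimize(
    initial: Candidate,
    evaluate: Callable[[Candidate], float],
    tree: KnowledgeNode,
    steps: int = 20,
    pool_size: int = 3,
    seed: int = 0,
) -> Tuple[float, Candidate]:
    """Refine candidates from a small pool, keeping the top scorers."""
    rng = random.Random(seed)
    pool = [(evaluate(initial), initial)]
    leaves = [leaf for leaf in tree.leaves() if leaf.refine is not None]
    for _ in range(steps):
        _, parent = rng.choice(pool)          # pick a candidate to build on
        child = rng.choice(leaves).refine(parent)  # apply one domain idea
        pool.append((evaluate(child), child))
        pool.sort(key=lambda pair: pair[0], reverse=True)
        del pool[pool_size:]                  # keep only the best candidates
    return pool[0]


# Toy knowledge tree for a smoothing method with one integer setting.
toy_tree = KnowledgeNode("denoising", children=[
    KnowledgeNode("smoothing", children=[
        KnowledgeNode("widen window",
                      refine=lambda c: {**c, "window": c["window"] + 1}),
        KnowledgeNode("narrow window",
                      refine=lambda c: {**c, "window": max(1, c["window"] - 1)}),
    ]),
])


def toy_evaluate(candidate: Candidate) -> float:
    """Stand-in evaluation function that peaks at a window size of 5."""
    return -float((candidate["window"] - 5) ** 2)
```

Calling `optimize({"window": 1}, toy_evaluate, toy_tree)` returns the best `(score, candidate)` pair found. Because the pool always retains its top scorers, the returned score can never fall below that of the initial candidate.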
Alistair Turcan
Carnegie Mellon University
Kexin Huang
Stanford University
Lei Li
Carnegie Mellon University
Martin Jinye Zhang
Assistant Professor, Computational Biology Department, Carnegie Mellon University
Statistical genetics, Single-cell RNA-seq, Statistics, Machine learning