🤖 AI Summary
Current AI scientist systems are bespoke, tied to rigid workflows, and lack a shared environment that unifies tools, data, and analyses, which hinders open, collaborative scientific discovery. To address this, the authors propose ToolUniverse, an ecosystem for building AI scientists from any language or reasoning model, whether open or closed. It integrates more than 600 machine learning models, datasets, APIs, and scientific packages for data analysis, knowledge retrieval, and experimental design, and it standardizes how AI scientists identify and call these tools. On top of that standard interface, ToolUniverse automatically refines tool interfaces, creates new tools from natural language descriptions, iteratively optimizes tool specifications, and composes tools into agentic workflows. In a hypercholesterolemia case study, ToolUniverse was used to build an AI scientist that identified a potent drug analog with favorable predicted properties. The platform is open source, which supports community-driven development of scientific AI agents.
📝 Abstract
AI scientists are emerging computational systems that serve as collaborative partners in discovery. These systems remain difficult to build because they are bespoke, tied to rigid workflows, and lack shared environments that unify tools, data, and analyses into a common ecosystem. In omics, unified ecosystems have transformed research by enabling interoperability, reuse, and community-driven development; AI scientists require comparable infrastructure. We present ToolUniverse, an ecosystem for building AI scientists from any language or reasoning model, whether open or closed. ToolUniverse standardizes how AI scientists identify and call tools, integrating more than 600 machine learning models, datasets, APIs, and scientific packages for data analysis, knowledge retrieval, and experimental design. It automatically refines tool interfaces for correct use by AI scientists, creates new tools from natural language descriptions, iteratively optimizes tool specifications, and composes tools into agentic workflows. In a case study of hypercholesterolemia, ToolUniverse was used to create an AI scientist that identified a potent analog of a drug with favorable predicted properties. The open-source ToolUniverse is available at https://aiscientist.tools.
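To make "standardizes how AI scientists identify and call tools" concrete, the following is a minimal Python sketch of the general pattern: a registry in which every tool carries a uniform specification, a lookup step that finds tools by description, and a call step that validates a structured request before running the tool. It is not the ToolUniverse API. The names `ToolSpec`, `ToolRegistry`, `find`, `call`, and `build_demo_registry`, as well as the two toy sequence tools, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ToolSpec:
    """Hypothetical uniform description of one tool (not ToolUniverse's schema)."""
    name: str
    description: str
    parameters: Dict[str, type]  # argument name -> expected Python type
    func: Callable[..., Any]


class ToolRegistry:
    """Hypothetical registry: one way to find tools, one way to call them."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        self._tools[spec.name] = spec

    def find(self, query: str) -> List[str]:
        """Return sorted names of tools whose description contains every query word."""
        words = query.lower().split()
        return sorted(
            name
            for name, spec in self._tools.items()
            if all(word in spec.description.lower() for word in words)
        )

    def call(self, request: Dict[str, Any]) -> Any:
        """Validate a {"name": ..., "arguments": {...}} request, then run the tool."""
        spec = self._tools[request["name"]]
        args = request.get("arguments", {})
        unknown = set(args) - set(spec.parameters)
        if unknown:
            raise ValueError(f"unknown arguments: {sorted(unknown)}")
        for key, expected in spec.parameters.items():
            if key not in args:
                raise ValueError(f"missing argument: {key}")
            if not isinstance(args[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        return spec.func(**args)


def gc_content(sequence: str) -> float:
    """Fraction of G or C bases; returns 0.0 for an empty sequence."""
    if not sequence:
        return 0.0
    return sum(base in "GC" for base in sequence.upper()) / len(sequence)


def reverse_complement(sequence: str) -> str:
    """Complement each base (A<->T, C<->G), then reverse the string."""
    return sequence.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_demo_registry() -> ToolRegistry:
    """Register the two toy tools under the shared specification format."""
    registry = ToolRegistry()
    registry.register(ToolSpec(
        name="gc_content",
        description="Compute the GC fraction of a DNA sequence",
        parameters={"sequence": str},
        func=gc_content,
    ))
    registry.register(ToolSpec(
        name="reverse_complement",
        description="Return the reverse complement of a DNA sequence",
        parameters={"sequence": str},
        func=reverse_complement,
    ))
    return registry


if __name__ == "__main__":
    registry = build_demo_registry()
    print(registry.find("dna sequence"))
    # Compose two tools: the output of one call feeds the next request.
    rc = registry.call({"name": "reverse_complement", "arguments": {"sequence": "ATGC"}})
    print(rc, registry.call({"name": "gc_content", "arguments": {"sequence": rc}}))
```

Because every tool is reached through the same request shape, the model behind the agent only has to produce a tool name and an argument dictionary, which is what makes a design like this independent of any particular language model. Composing tools into a workflow then reduces to passing one call's result into the next request, as the last lines of the sketch do.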