MADE: Benchmark Environments for Closed-Loop Materials Discovery

📅 2026-01-28
🤖 AI Summary
Existing benchmarks for materials discovery mostly target static prediction or isolated subtasks, and so fail to capture the iterative, adaptive nature of scientific discovery. This work proposes MADE, a composable and extensible framework for benchmarking closed-loop materials discovery. It formalizes discovery as a search for compounds that are thermodynamically stable relative to a given convex hull, carried out under a constrained oracle budget. Discovery agents are composed from interchangeable components such as generative models, filters, and planners, supporting workflows that range from fixed pipelines to fully agentic systems with tool use and adaptive decision making. Experiments across a family of systems ablate pipeline components and compare how methods scale with system complexity, with efficacy and efficiency measured against baseline algorithms.
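
To make the stability criterion concrete, here is a minimal sketch of an energy-above-hull calculation for a toy binary system A-B. It is an illustration under stated assumptions, not MADE's implementation: the function names (`lower_hull`, `hull_energy`, `energy_above_hull`), the example energies, and the restriction to two elements are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

# (fraction of element B, formation energy per atom in eV); assumed toy representation
Point = Tuple[float, float]


def lower_hull(points: List[Point]) -> List[Point]:
    """Lower convex hull of (composition, energy) points, left to right."""
    # Keep only the lowest-energy entry at each composition.
    best: Dict[float, float] = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))
    hull: List[Point] = []
    for p in sorted(best.items()):
        # Andrew's monotone chain: drop the last vertex while it lies
        # on or above the line from hull[-2] to the new point p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(hull: List[Point], x: float) -> float:
    """Linearly interpolate the hull energy at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition lies outside the hull range")


def energy_above_hull(known: List[Point], candidate: Point) -> float:
    """Candidate energy minus the hull built from the known phases.

    A value <= 0 means the candidate is stable relative to the given hull.
    """
    x, e = candidate
    return e - hull_energy(lower_hull(known), x)


# Toy reference hull: the two elements plus one known compound AB.
known_phases = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.4)]
```

With this reference hull, a candidate at composition 0.25 is stable only if its formation energy falls at or below the tie-line between pure A and AB; real multi-component systems need a general convex hull routine rather than this one-dimensional version.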

📝 Abstract
Existing benchmarks for computational materials discovery primarily evaluate static predictive tasks or isolated computational sub-tasks. While valuable, these evaluations neglect the inherently iterative and adaptive nature of scientific discovery. We introduce MAterials Discovery Environments (MADE), a novel framework for benchmarking end-to-end autonomous materials discovery pipelines. MADE simulates closed-loop discovery campaigns in which an agent or algorithm proposes, evaluates, and refines candidate materials under a constrained oracle budget, capturing the sequential and resource-limited nature of real discovery workflows. We formalize discovery as a search for thermodynamically stable compounds relative to a given convex hull, and evaluate efficacy and efficiency via comparison to baseline algorithms. The framework is flexible; users can compose discovery agents from interchangeable components such as generative models, filters, and planners, enabling the study of arbitrary workflows ranging from fixed pipelines to fully agentic systems with tool use and adaptive decision making. We demonstrate this by conducting systematic experiments across a family of systems, enabling ablation of components in discovery pipelines, and comparison of how methods scale with system complexity.
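
The closed-loop setup described in the abstract can be sketched as a short loop in which a generator proposes candidates, a filter prunes them, a planner chooses what to evaluate, and an oracle is queried until the budget runs out. This is a minimal sketch under stated assumptions, not the framework's actual API: `run_campaign`, its parameters, and the integer "candidates" with a toy oracle are all hypothetical stand-ins.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Candidate = int  # hypothetical stand-in for a candidate material


def run_campaign(
    generate: Callable[[Dict[Candidate, float]], Iterable[Candidate]],
    keep: Callable[[Candidate], bool],
    plan: Callable[[List[Candidate], Dict[Candidate, float]], List[Candidate]],
    oracle: Callable[[Candidate], float],
    budget: int,
    threshold: float = 0.0,
) -> Tuple[Dict[Candidate, float], List[Candidate]]:
    """Propose, filter, select, and evaluate until the oracle budget is spent."""
    evaluated: Dict[Candidate, float] = {}  # candidate -> oracle score
    while len(evaluated) < budget:
        # Generator sees the history so it can adapt its proposals.
        pool = [c for c in generate(evaluated) if c not in evaluated and keep(c)]
        # Planner picks a batch; truncate it so the budget is never exceeded.
        batch = plan(pool, evaluated)[: budget - len(evaluated)]
        if not batch:
            break  # nothing left to evaluate
        for c in batch:
            evaluated[c] = oracle(c)  # each call consumes one unit of budget
    # A "hit" is a candidate whose score (e.g. energy above hull) is <= threshold.
    hits = [c for c, score in evaluated.items() if score <= threshold]
    return evaluated, hits


# Toy components (assumptions for illustration only).
evaluated, hits = run_campaign(
    generate=lambda history: range(100),          # propose every integer below 100
    keep=lambda c: c % 2 == 0,                    # cheap filter: even numbers only
    plan=lambda pool, history: sorted(pool)[:4],  # evaluate the 4 smallest first
    oracle=lambda c: 0.0 if c % 10 == 0 else 0.1, # toy "energy above hull"
    budget=10,
)
```

Because each component is just a callable, swapping in a different generator, filter, or planner changes the workflow without touching the loop, which is the kind of ablation the abstract describes.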
Problem

Research questions and friction points this paper is trying to address.

materials discovery
benchmarking
closed-loop
autonomous discovery
iterative optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

closed-loop discovery
autonomous materials discovery
benchmark framework
sequential decision making
modular agent design
Shreshth A. Malik
OATML, Department of Computer Science, University of Oxford; Diffractive Labs
Tiarnan Doherty
Diffractive Labs
P. Tigas
Diffractive Labs
Muhammed Razzak
Diffractive Labs
Stephen J. Roberts
Machine Learning Research Group, Department of Engineering Science, University of Oxford
Aron Walsh
Department of Materials, Imperial College London
Materials Design, Solid-State Chemistry, AI for Science, Solar Energy
Yarin Gal
Professor of Machine Learning, University of Oxford
Machine Learning, Artificial Intelligence, Probability Theory, Statistics