MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow

📅 2025-01-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: High-throughput design of metal–organic frameworks (MOFs) suffers from decoupled generation and screening stages and from inefficient coordination of heterogeneous computing resources. Method: MOFA is an open-source workflow that couples generative AI with multiscale physics-based simulations: diffusion models/VAEs for molecular generation, followed by screening with molecular dynamics, density functional theory (DFT), and grand-canonical Monte Carlo (GCMC) simulations. An online learning framework unifies these heterogeneous tasks and optimizes the use of available CPU and GPU resources, closing the loop between generation and simulation. Contribution/Results: The modular architecture is intended for reuse in other scientific applications that combine GenAI with large-scale simulations. On a 450-node supercomputer (1,800 NVIDIA A100 GPUs), MOFA generates novel MOF structures at high throughput, with CO₂ adsorption capacities ranking among the top 10 in the hypothetical MOF (hMOF) dataset. The production of high-quality MOFs scales linearly with the number of nodes used, which supports the feasibility of large-scale HPC-enabled generative AI for materials discovery.

📝 Abstract

We present MOFA, an open-source generative AI (GenAI) plus simulation workflow for high-throughput generation of metal-organic frameworks (MOFs) on large-scale high-performance computing (HPC) systems. MOFA addresses key challenges in integrating GPU-accelerated computing for GPU-intensive GenAI tasks, including distributed training and inference, alongside CPU- and GPU-optimized tasks for screening and filtering AI-generated MOFs using molecular dynamics, density functional theory, and Monte Carlo simulations. These heterogeneous tasks are unified within an online learning framework that optimizes the utilization of available CPU and GPU resources across HPC systems. Performance metrics from a 450-node (14,400 AMD Zen 3 CPUs + 1,800 NVIDIA A100 GPUs) supercomputer run demonstrate that MOFA achieves high-throughput generation of novel MOF structures, with CO$_2$ adsorption capacities ranking among the top 10 in the hypothetical MOF (hMOF) dataset. Furthermore, the production of high-quality MOFs exhibits a linear relationship with the number of nodes utilized. The modular architecture of MOFA will facilitate its integration into other scientific applications that dynamically combine GenAI with large-scale simulations.
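
The generate, screen, and retrain loop described in the abstract can be illustrated with a toy sketch. This is not MOFA's code or API: every name here (`ToyGenerator`, `cheap_screen`, `expensive_score`, `run_loop`) is hypothetical, the "design" is a single number rather than a MOF structure, the scoring function is an arbitrary stand-in for a simulation, and the two thread pools only mimic the idea of routing cheap and costly tasks to separate resource classes.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class ToyGenerator:
    """Stand-in for a generative model: samples a 1-D 'design' around a learned mean."""
    mean: float = 0.0
    spread: float = 1.0

    def sample(self, rng, n):
        return [rng.gauss(self.mean, self.spread) for _ in range(n)]

    def retrain(self, best):
        # Online-learning step: move the mean toward the best designs seen so far.
        self.mean = sum(best) / len(best)


def cheap_screen(x):
    """Stand-in for a fast filter (assumed analogue: a quick validity or stability check)."""
    return abs(x) < 5.0


def expensive_score(x):
    """Stand-in for a costly property estimate; the toy optimum is at x = 3."""
    return -(x - 3.0) ** 2


def run_loop(rounds=5, batch=64, keep=8, seed=0):
    rng = random.Random(seed)
    gen = ToyGenerator()
    scored = []  # (score, design) for every candidate that passed screening
    # Two pools mimic separate resource classes; the worker counts are arbitrary.
    with ThreadPoolExecutor(max_workers=4) as screen_pool, \
            ThreadPoolExecutor(max_workers=2) as score_pool:
        for _ in range(rounds):
            candidates = gen.sample(rng, batch)
            verdicts = screen_pool.map(cheap_screen, candidates)
            passed = [x for x, ok in zip(candidates, verdicts) if ok]
            scores = list(score_pool.map(expensive_score, passed))
            scored.extend(zip(scores, passed))
            scored.sort(reverse=True)  # best score first
            gen.retrain([x for _, x in scored[:keep]])
    return gen, scored
```

Calling `run_loop()` returns the toy generator and the ranked list of scored designs. Because each round retrains on the best designs found so far, later batches are sampled closer to the optimum, which is the closed-loop behaviour the abstract attributes to the online learning framework.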

Problem

Research questions and friction points this paper is trying to address.

CO₂ Capture

Metal-Organic Frameworks (MOFs)

High-Performance Computing (HPC)

Innovation

Methods, ideas, or system contributions that make the work stand out.

Artificial Intelligence

High-Performance Computing

Metal-Organic Frameworks

🔎 Similar Papers

Genetic-guided GFlowNets for Sample Efficient Molecular Optimization

2024-02-05

Citations: 1
