MOFGPT: Generative Design of Metal-Organic Frameworks using Language Models

📅 2025-05-30
🤖 AI Summary
Problem: The vast structural space of metal–organic frameworks (MOFs) and the prohibitive computational cost of density functional theory (DFT) and molecular simulations hinder scalable inverse design of MOFs. Method: This work introduces MOFGPT, a chemistry-aware, string-based generative framework that combines a GPT-based language model trained on MOFid sequences, the MOFormer property predictor, and proximal policy optimization (PPO) reinforcement learning to enable end-to-end, property-guided de novo design of synthesizable MOFs. Contribution/Results: The framework is built on MOFid, a string encoding that captures both topology and connectivity, and is presented as the first systematic application of RL-enhanced generative language models to MOF inverse design. Reported experiments show >92% topological validity, high synthetic feasibility, and better performance than random sampling and VAE baselines on target properties, including gas adsorption and electrical conductivity. A single inference run yields hundreds of high-quality candidate structures.

📝 Abstract
The discovery of Metal-Organic Frameworks (MOFs) with application-specific properties remains a central challenge in materials chemistry, owing to the immense size and complexity of their structural design space. Conventional computational screening techniques such as molecular simulations and density functional theory (DFT), while accurate, are computationally prohibitive at scale. Machine learning offers an exciting alternative by leveraging data-driven approaches to accelerate materials discovery. The complexity of MOFs, with their extended periodic structures and diverse topologies, creates both opportunities and challenges for generative modeling approaches. To address these challenges, we present a reinforcement learning-enhanced, transformer-based framework for the de novo design of MOFs. Central to our approach is MOFid, a chemically-informed string representation encoding both connectivity and topology, enabling scalable generative modeling. Our pipeline comprises three components: (1) a generative GPT model trained on MOFid sequences, (2) MOFormer, a transformer-based property predictor, and (3) a reinforcement learning (RL) module that optimizes generated candidates via property-guided reward functions. By integrating property feedback into sequence generation, our method drives the model toward synthesizable, topologically valid MOFs with desired functional attributes. This work demonstrates the potential of large language models, when coupled with reinforcement learning, to accelerate inverse design in reticular chemistry and unlock new frontiers in computational MOF discovery.
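The three-stage loop described in the abstract (generate, predict, reward) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the three-letter vocabulary stands in for MOFid tokens, `predict_property` stands in for a learned predictor such as MOFormer, and the update rule is plain REINFORCE with a batch-mean baseline, a simpler substitute for the paper's RL module rather than a reproduction of it.

```python
import math
import random

VOCAB = ["A", "B", "C"]  # hypothetical tokens standing in for MOFid tokens
SEQ_LEN = 4              # hypothetical fixed sequence length


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits, rng):
    """Stage 1 stand-in: sample one token index per position from the policy."""
    return [rng.choices(range(len(VOCAB)), weights=softmax(row))[0]
            for row in logits]


def predict_property(seq):
    """Stage 2 stand-in for a property predictor: fraction of 'A' tokens."""
    return sum(1 for t in seq if VOCAB[t] == "A") / len(seq)


def reward(seq, target=1.0):
    """Stage 3: property-guided reward, larger when the prediction nears the target."""
    return -abs(predict_property(seq) - target)


def train(steps=300, batch=16, lr=1.0, seed=0):
    """REINFORCE with a batch-mean baseline on a per-position categorical policy."""
    rng = random.Random(seed)
    logits = [[0.0] * len(VOCAB) for _ in range(SEQ_LEN)]
    history = []  # mean reward of each sampled batch
    for _ in range(steps):
        seqs = [sample_sequence(logits, rng) for _ in range(batch)]
        rewards = [reward(s) for s in seqs]
        baseline = sum(rewards) / batch
        history.append(baseline)
        probs = [softmax(row) for row in logits]  # policy used for sampling
        for s, r in zip(seqs, rewards):
            advantage = r - baseline
            for pos, tok in enumerate(s):
                for k in range(len(VOCAB)):
                    # d log pi(tok) / d logit_k = 1[k == tok] - p_k
                    grad = (1.0 if k == tok else 0.0) - probs[pos][k]
                    logits[pos][k] += lr * advantage * grad / batch
    return logits, history


if __name__ == "__main__":
    _, history = train()
    print("mean reward, first 20 batches:", sum(history[:20]) / 20)
    print("mean reward, last 20 batches:", sum(history[-20:]) / 20)
```

The point of the sketch is the data flow: the reward is computed from the predictor's output rather than from a simulation, which is what makes property feedback cheap enough to apply at every generation step.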
Problem

Research questions and friction points this paper is trying to address.

Generative design of MOFs with specific properties
Overcoming computational limitations in MOF discovery
Integrating property feedback into generation to obtain valid, synthesizable MOFs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning-enhanced transformer framework
MOFid string representation for generative modeling
Property-guided RL optimizes MOF candidates
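To make the idea of a string that carries both connectivity and topology concrete, here is a minimal parsing sketch. It assumes the layout commonly documented for MOFid-v1 strings, namely dot-separated SMILES building blocks followed by a `MOFid-v1.<topology>.<catenation>` tag and an optional `;comment`. The helper names and the example string are illustrative and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParsedMOFid:
    building_blocks: List[str]  # SMILES fragments (connectivity information)
    topology: str               # net code, e.g. "pcu"
    catenation: str             # e.g. "cat0"


def parse_mofid(mofid: str) -> ParsedMOFid:
    """Split an assumed MOFid-v1-style string into connectivity and topology parts."""
    smiles_part, sep, tag = mofid.strip().partition(" ")
    if not sep or not tag.startswith("MOFid-v1."):
        raise ValueError("expected '<SMILES> MOFid-v1.<topology>.<catenation>'")
    tag = tag.split(";", 1)[0]  # drop an optional trailing comment
    fields = tag.split(".")
    if len(fields) != 3:
        raise ValueError("tag must have the form MOFid-v1.<topology>.<catenation>")
    _, topology, catenation = fields
    return ParsedMOFid(smiles_part.split("."), topology, catenation)


def is_well_formed(mofid: str) -> bool:
    """Cheap syntactic check only; it says nothing about chemical validity."""
    try:
        parsed = parse_mofid(mofid)
    except ValueError:
        return False
    return all(parsed.building_blocks) and bool(parsed.topology)


if __name__ == "__main__":
    # Illustrative string written for this sketch, not copied from a dataset.
    example = "[Zn][O]([Zn])([Zn])[Zn].[O-]C(=O)c1ccc(cc1)C(=O)[O-] MOFid-v1.pcu.cat0"
    print(parse_mofid(example))
```

A syntactic filter of this kind could serve as a first gate on generated sequences before any property prediction; a real validity check would also need to confirm that the building blocks are chemically sensible and compatible with the stated net.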
Authors

Srivathsan Badrinarayanan
Researcher, Carnegie Mellon University
Chemical Engineering, Machine Learning, AI4Science
Rishikesh Magar
Department of Mechanical Engineering, Carnegie Mellon University, 15213, USA
Akshay Antony
Department of Mechanical Engineering, Carnegie Mellon University, 15213, USA
Radheesh Sharma Meda
Department of Chemical Engineering, Carnegie Mellon University, 15213, USA
A. Farimani
Department of Chemical Engineering, Carnegie Mellon University, 15213, USA; Department of Mechanical Engineering, Carnegie Mellon University, 15213, USA; Department of Biomedical Engineering, Carnegie Mellon University, 15213, USA; Machine Learning Department, Carnegie Mellon University, 15213, USA