Reshaping MOFs Text Mining with a Dynamic Multi-Agent Framework of Large Language Agents

📅 2025-04-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Precise extraction of metal–organic framework (MOF) synthesis conditions from vast scientific literature remains challenging due to heterogeneity, ambiguity, and lack of structured representation. Method: We propose MOFh6, a dynamic multi-agent framework powered by GPT-4o-mini that integrates multimodal inputs (literature text, MOF structural encodings, and physicochemical properties) and orchestrates domain-specialized agents for synthesis planning, property prediction, and chemical reasoning. The system leverages a curated MOF knowledge graph and advanced prompt engineering to enable natural-language-driven, end-to-end synthesis pathway recommendation and automated generation of DFT-ready structural files (e.g., CIF). Contribution/Results: MOFh6 introduces the first dynamic multi-agent architecture for MOF synthesis informatics, significantly improving retrieval accuracy and efficiency of synthesis conditions. It supports heterogeneous query formats and achieves full automation, from unstructured textual descriptions to computationally executable models, marking the first such capability in MOF computational design.

📝 Abstract
The mining of synthesis conditions for metal-organic frameworks (MOFs) is a significant focus in materials science. However, identifying the precise synthesis conditions for a specific MOF within the vast array of possibilities presents a considerable challenge. Large Language Models (LLMs) offer a promising solution to this problem. We leveraged the capabilities of LLMs, specifically gpt-4o-mini, as core agents to integrate various MOF-related agents, including synthesis, attribute, and chemical information agents. This integration culminated in the development of MOFh6, an LLM tool designed to streamline the MOF synthesis process. MOFh6 allows users to query in multiple formats, such as submitting scientific literature or inquiring about specific MOF codes or structural properties. The tool analyzes these queries to provide optimal synthesis conditions and generates model files for density functional theory (DFT) pre-modeling. We believe MOFh6 will make MOF synthesis more efficient for researchers.
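
The abstract describes a core agent that accepts several query formats and hands them to specialised agents. As a rough illustration only, that routing idea can be sketched in a few lines of Python. Nothing here is taken from MOFh6: the function names, the keyword list, the mapping of query types to agents, and the assumption that a "MOF code" looks like a CSD-style refcode (six capital letters, optionally followed by two digits) are all hypothetical, and simple rules stand in for the LLM calls a real system would make.

```python
import re
from typing import Callable, Dict

# Assumption: MOF codes resemble CSD-style refcodes (six capitals, optional two digits).
REFCODE = re.compile(r"\b[A-Z]{6}(?:\d{2})?\b")
# Assumption: a few illustrative keywords that signal a structural-property question.
PROPERTY_TERMS = ("surface area", "pore", "void fraction", "density")


def classify(query: str) -> str:
    """Pick an agent name for a query using simple rules (a stand-in for an LLM router)."""
    lowered = query.lower()
    if any(term in lowered for term in PROPERTY_TERMS):
        return "attribute"
    if REFCODE.search(query):
        return "chemical_info"
    # Anything else is treated as literature text to mine for synthesis conditions.
    return "synthesis"


# Hypothetical placeholder agents; each just tags the query it received.
def synthesis_agent(query: str) -> str:
    return f"synthesis: {query}"


def attribute_agent(query: str) -> str:
    return f"attribute: {query}"


def chemical_info_agent(query: str) -> str:
    return f"chemical_info: {query}"


AGENTS: Dict[str, Callable[[str], str]] = {
    "synthesis": synthesis_agent,
    "attribute": attribute_agent,
    "chemical_info": chemical_info_agent,
}


def dispatch(query: str) -> str:
    """Route the query to the agent chosen by classify()."""
    return AGENTS[classify(query)](query)
```

For example, `dispatch("Look up ABCDEF01")` would be handled by the placeholder chemical information agent, while a pasted experimental paragraph with no code or property keyword falls through to the synthesis agent.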
Problem

Research questions and friction points this paper is trying to address.

Identifying precise MOF synthesis conditions from vast possibilities
Integrating MOF-related agents using LLMs for synthesis optimization
Streamlining MOF synthesis queries and DFT pre-modeling file generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic multi-agent framework with LLMs
Integration of MOF-related specialized agents
Automated synthesis condition analysis and modeling
Zuhong Lin
Center for Environment and Water Resources, College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China; Key Laboratory of Hunan Province for Water Environment and Agriculture Product Safety, Changsha, 410083, PR China
Daoyuan Ren
School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, PR China
Kai Ran
Center for Environment and Water Resources, College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China; Key Laboratory of Hunan Province for Water Environment and Agriculture Product Safety, Changsha, 410083, PR China
Sun Jing
Faculty of Data Science, City University of Macau, 999078, Macao SAR, PR China
Xiaotiang Huang
Center for Environment and Water Resources, College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China; Key Laboratory of Hunan Province for Water Environment and Agriculture Product Safety, Changsha, 410083, PR China
Haiyang He
State Key Joint Laboratory of Environment Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, 100084, PR China
Pengxu Pan
Center for Environment and Water Resources, College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China; Key Laboratory of Hunan Province for Water Environment and Agriculture Product Safety, Changsha, 410083, PR China
Xiaohang Zhang
Center for Environment and Water Resources, College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China; Key Laboratory of Hunan Province for Water Environment and Agriculture Product Safety, Changsha, 410083, PR China
Ying Fang
Westlake University; Zhejiang University
speech recognition
Tianying Wang
Colorado State University
high-dimensional data analysis, measurement error, quantile regression, genetics analysis, statistical genetics
Minli Wu
School of Computer Science and Engineering, Central South University, Changsha, 410083, PR China
Zhanglin Li
School of Computer Science and Engineering, Central South University, Changsha, 410083, PR China
Xiaochuan Zhang
School of Software Engineering, Sun Yat-sen University, Zhuhai 519000, PR China
Haipu Li
Center for Environment and Water Resources, College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China; Key Laboratory of Hunan Province for Water Environment and Agriculture Product Safety, Changsha, 410083, PR China
Jingjing Yao
Assistant Professor, Dept of Computer Science, Texas Tech University
Internet of Things, Federated learning, UAV/drone assisted networks, Deep reinforcement learning