Enhancing Spatial Reasoning in Large Language Models for Metal-Organic Frameworks Structure Prediction

📅 2026-01-14
📈 Citations: 1
✨ Influential: 0

🤖 AI Summary
Accurately predicting the three-dimensional structures of metal–organic frameworks (MOFs) remains highly challenging due to their intricate atomic arrangements, which existing large language models struggle to represent effectively. This work proposes MOF-LLM, the first large language model framework tailored for modular MOF structure prediction, built upon Qwen-3 8B and enhanced with a block-level generation paradigm to improve spatial reasoning. The approach integrates spatial priors, continual pretraining (CPT), structure-supervised fine-tuning (SFT), and match-driven reinforcement learning, complemented by a novel Soft Adaptive Policy Optimization strategy to enhance structural stability. Experimental results demonstrate that MOF-LLM significantly outperforms current state-of-the-art denoising and language-model-based methods in both prediction accuracy and sampling efficiency.

📝 Abstract
Metal-organic frameworks (MOFs) are porous crystalline materials with broad applications such as carbon capture and drug delivery, yet accurately predicting their 3D structures remains a significant challenge. While Large Language Models (LLMs) have shown promise in generating crystals, their application to MOFs is hindered by MOFs' high atomic complexity. Inspired by the success of block-wise paradigms in deep generative models, we pioneer the use of LLMs in this domain by introducing MOF-LLM, the first LLM framework specifically adapted for block-level MOF structure prediction. To effectively harness LLMs for this modular assembly task, our training paradigm integrates spatial-aware continual pre-training (CPT), structural supervised fine-tuning (SFT), and matching-driven reinforcement learning (RL). By incorporating explicit spatial priors and optimizing structural stability via Soft Adaptive Policy Optimization (SAPO), our approach substantially enhances the spatial reasoning capability of a Qwen-3 8B model for accurate MOF structure prediction. Comprehensive experiments demonstrate that MOF-LLM outperforms state-of-the-art denoising-based and LLM-based methods while exhibiting superior sampling efficiency.
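
The abstract does not spell out how a block-level prediction is represented, so the following is only a toy sketch of the general idea behind modular assembly, not the MOF-LLM method. It assumes each building block is a rigid body with fixed local atom coordinates, and that a prediction is one pose (a rotation plus a translation) per block. The function names, the restriction to rotations about the z-axis, and the example blocks are all hypothetical and chosen for brevity.

```python
import math


def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def place_block(local_atoms, angle, translation):
    """Apply one rigid-body pose (rotation, then translation) to every atom of a block."""
    tx, ty, tz = translation
    placed = []
    for element, position in local_atoms:
        x, y, z = rotate_z(position, angle)
        placed.append((element, (x + tx, y + ty, z + tz)))
    return placed


def assemble(blocks, poses):
    """Build a full atom list from rigid blocks and one pose per block."""
    structure = []
    for local_atoms, (angle, translation) in zip(blocks, poses):
        structure.extend(place_block(local_atoms, angle, translation))
    return structure


# Hypothetical toy blocks in their own local frames (not real MOF geometry).
metal_node = [("Zn", (0.0, 0.0, 0.0))]
linker = [("C", (-1.0, 0.0, 0.0)), ("C", (1.0, 0.0, 0.0))]

# One pose per block: (rotation angle about z, translation vector).
poses = [
    (0.0, (0.0, 0.0, 0.0)),
    (math.pi / 2, (0.0, 3.0, 0.0)),
]

structure = assemble([metal_node, linker], poses)
for element, (x, y, z) in structure:
    print(f"{element}: ({x:.2f}, {y:.2f}, {z:.2f})")
```

Under these assumptions the model only has to emit one pose per block rather than coordinates for every atom, which is one plausible reason a block-level formulation shortens the output for structures with many atoms.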

Problem

Research questions and friction points this paper is trying to address.

Metal-Organic Frameworks
Structure Prediction
Spatial Reasoning
Large Language Models
3D Structure

Innovation

Methods, ideas, or system contributions that make the work stand out.

MOF-LLM
spatial-aware continual pre-training
block-level structure prediction
Soft Adaptive Policy Optimization
large language models for crystal generation

Authors
Mianzhi Pan
National Key Laboratory for Novel Software Technology & School of Artificial Intelligence, Nanjing University, China
Jianfei Li
National Key Laboratory for Novel Software Technology & School of Artificial Intelligence, Nanjing University, China
Peishuo Liu
National Key Laboratory for Novel Software Technology & School of Artificial Intelligence, Nanjing University, China
Botian Wang
Institute of AI Industry Research (AIR), Tsinghua University
Yawen Ouyang
Shanghai AI Lab
Yiming Rong
University of Chinese Academy of Sciences, Beijing
Hao Zhou
Institute of AI Industry Research (AIR), Tsinghua University
Jianbing Zhang
Associate Professor, Nanjing University
pre-training model, multi-modal, image captioning, natural language processing, data mining