Enhancing Spatial Reasoning in Large Language Models for Metal-Organic Frameworks Structure Prediction

📅 2026-01-14
📈 Citations: 1
✨ Influential: 0
📄 PDF
🤖 AI Summary
Accurately predicting the three-dimensional structures of metal–organic frameworks (MOFs) remains highly challenging due to their intricate atomic arrangements, which existing large language models struggle to represent effectively. This work proposes MOF-LLM, the first large language model framework tailored for modular MOF structure prediction, built upon Qwen-3 8B and enhanced with a block-level generation paradigm to improve spatial reasoning. The approach integrates spatial priors, continual pretraining (CPT), structure-supervised fine-tuning (SFT), and match-driven reinforcement learning, complemented by a novel Soft Adaptive Policy Optimization strategy to enhance structural stability. Experimental results demonstrate that MOF-LLM significantly outperforms current state-of-the-art denoising and language-model-based methods in both prediction accuracy and sampling efficiency.

๐Ÿ“ Abstract
Metal-organic frameworks (MOFs) are porous crystalline materials with broad applications such as carbon capture and drug delivery, yet accurately predicting their 3D structures remains a significant challenge. While Large Language Models (LLMs) have shown promise in generating crystals, their application to MOFs is hindered by MOFs' high atomic complexity. Inspired by the success of block-wise paradigms in deep generative models, we pioneer the use of LLMs in this domain by introducing MOF-LLM, the first LLM framework specifically adapted for block-level MOF structure prediction. To effectively harness LLMs for this modular assembly task, our training paradigm integrates spatial-aware continual pre-training (CPT), structural supervised fine-tuning (SFT), and matching-driven reinforcement learning (RL). By incorporating explicit spatial priors and optimizing structural stability via Soft Adaptive Policy Optimization (SAPO), our approach substantially enhances the spatial reasoning capability of a Qwen-3 8B model for accurate MOF structure prediction. Comprehensive experiments demonstrate that MOF-LLM outperforms state-of-the-art denoising-based and LLM-based methods while exhibiting superior sampling efficiency.
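To make the block-level idea in the abstract concrete, the following is a minimal sketch of how a MOF might be serialized block-by-block (metal nodes and organic linkers) into a text sequence for an LLM. All tag names (`<block:...>`, `<lattice>`), block labels, and coordinates here are illustrative assumptions; the paper's actual tokenization and prompt scheme is not specified in this abstract.

```python
# Hypothetical block-level serialization of a MOF structure into LLM text.
# The format below is invented for illustration only.

def serialize_block(name, atoms):
    """Render one building block (e.g., metal node or organic linker) as text.

    atoms: list of (element, x, y, z) tuples with fractional coordinates.
    """
    lines = [f"<block:{name}>"]
    for elem, x, y, z in atoms:
        lines.append(f"{elem} {x:.3f} {y:.3f} {z:.3f}")
    lines.append(f"</block:{name}>")
    return "\n".join(lines)

def serialize_mof(lattice, blocks):
    """Concatenate lattice parameters and all blocks into one sequence."""
    a, b, c = lattice
    header = f"<lattice> {a:.2f} {b:.2f} {c:.2f} </lattice>"
    body = "\n".join(serialize_block(name, atoms) for name, atoms in blocks)
    return header + "\n" + body

# Toy example: one Zn node and a two-atom fragment of an organic linker.
prompt = serialize_mof(
    lattice=(25.83, 25.83, 25.83),
    blocks=[
        ("metal_node", [("Zn", 0.250, 0.250, 0.250)]),
        ("linker", [("C", 0.300, 0.250, 0.250), ("O", 0.320, 0.270, 0.250)]),
    ],
)
print(prompt)
```

Generating block-by-block rather than atom-by-atom shortens the effective sequence per decision and lets the model reason over chemically meaningful units, which is the motivation the abstract attributes to block-wise paradigms.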
Problem

Research questions and friction points this paper is trying to address.

Metal-Organic Frameworks
Structure Prediction
Spatial Reasoning
Large Language Models
3D Structure
Innovation

Methods, ideas, or system contributions that make the work stand out.

MOF-LLM
spatial-aware continual pre-training
block-level structure prediction
Soft Adaptive Policy Optimization
large language models for crystal generation
Mianzhi Pan
National Key Laboratory for Novel Software Technology & School of Artificial Intelligence, Nanjing University, China
Jianfei Li
National Key Laboratory for Novel Software Technology & School of Artificial Intelligence, Nanjing University, China
Peishuo Liu
National Key Laboratory for Novel Software Technology & School of Artificial Intelligence, Nanjing University, China
Botian Wang
Institute of AI Industry Research (AIR), Tsinghua University
Yawen Ouyang
Shanghai AI Lab
Yiming Rong
University of Chinese Academy of Sciences, Beijing
Hao Zhou
Institute of AI Industry Research (AIR), Tsinghua University
Jianbing Zhang
Associate Professor, Nanjing University
Research interests: pre-training models, multi-modal, image captioning, natural language processing, data mining