AI-Empowered Catalyst Discovery: A Survey from Classical Machine Learning Approaches to Large Language Models

📅 2025-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: A systematic review of AI applications in catalyst discovery, particularly of the mechanisms by which large language models (LLMs) enable it, is currently lacking. Method: This work introduces the first unified AI taxonomy covering both homogeneous and heterogeneous catalysis. It maps the technical evolution from classical machine learning (e.g., RF, SVM, GNN) to deep learning (CNN, RNN, Transformer) and LLMs (fine-tuning, prompt engineering, knowledge-augmented reasoning), and analyzes method-specific advantages and cross-disciplinary challenges. Contribution/Results: This work proposes a novel LLM-driven paradigm for reaction prediction, mechanistic hypothesis generation, and scientific literature comprehension, and releases a structured literature corpus, benchmark datasets, and an open-source toolkit (GitHub). As the field’s first panoramic survey on LLM-empowered catalytic discovery, this work has been adopted for teaching and research at multiple universities and laboratories.

📝 Abstract
Catalysts are essential for accelerating chemical reactions and enhancing selectivity, which is crucial for the sustainable production of energy, materials, and bioactive compounds. Catalyst discovery is fundamental yet challenging in computational chemistry and has garnered significant attention due to the promising performance of advanced Artificial Intelligence (AI) techniques. The development of Large Language Models (LLMs) notably accelerates progress in the discovery of both homogeneous and heterogeneous catalysts, whose chemical reactions differ significantly in material phase, temperature, dynamics, and other respects. However, there is currently no comprehensive survey that discusses the progress and latest developments in both areas, particularly with the application of LLM techniques. To address this gap, this paper presents a thorough and systematic survey of AI-empowered catalyst discovery, employing a unified and general categorization for homogeneous and heterogeneous catalysts. We examine the progress of AI-empowered catalyst discovery, highlighting the advantages and disadvantages of the individual approaches, and discuss the challenges faced in this field. Furthermore, we suggest potential directions for future research from the perspective of computer science. Our goal is to help researchers in computational chemistry, computer science, and related fields track the latest advancements by providing a clear overview and roadmap of this area. We also organize and make accessible relevant resources, including article lists and datasets, in an open repository at https://github.com/LuckyGirl-XU/Awesome-Artificial-Intelligence-Empowered-Catalyst-Discovery.
Problem

Research questions and friction points this paper is trying to address.

AI accelerates catalyst discovery
Survey covers ML to LLMs
Addresses homogeneous and heterogeneous catalysts
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI techniques for catalyst discovery
Large Language Models application
Unified categorization for catalysts
Yuanyuan Xu
University of New South Wales
Graph Neural Networks, Big Data
Hanchen Wang
Australian Artificial Intelligence Institute, University of Technology Sydney, Australia
Wenjie Zhang
School of Computer Science and Engineering, The University of New South Wales, Australia
Lexing Xie
School of Computing, Australian National University, Australia
Yin Chen
Lecturer in Mathematics at University of Saskatchewan
Invariant theory, Lie theory, Commutative algebra, Applied algebraic geometry
Flora Salim
Professor, CSE, UNSW
Machine Learning, Time Series, Spatiotemporal, UbiComp, Foundation Models
Ying Zhang
Australian Artificial Intelligence Institute, University of Technology Sydney, Australia
Justin Gooding
The University of New South Wales
surface chemistry, electrochemistry, biosensors, nanomedicine, nanotechnology
Toby Walsh
Professor, UNSW and CSIRO Data61
AI, Constraints, Satisfiability, Optimisation, Computational Social Choice