34 Examples of LLM Applications in Materials Science and Chemistry: Towards Automation, Assistants, Agents, and Accelerated Scientific Discovery

📅 2025-05-05
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Problem: The application of large language models (LLMs) in materials science and chemistry lacks systematic organization and paradigmatic synthesis. Method: This study conducts the first large-scale integration of 34 cross-scenario LLM deployments, spanning seven domains (including property prediction, materials design, and research automation) and employing hybrid techniques: open/closed-source LLMs, prompt engineering, retrieval-augmented generation (RAG), fine-tuning, and multimodal structured–unstructured co-modeling. Contribution/Results: We propose three novel paradigms (low-data adaptation, hypothesis-driven generation, and multimodal knowledge fusion) that overcome key bottlenecks in few-shot learning and complex scientific reasoning. Experimental evaluation demonstrates that LLMs function effectively as high-accuracy predictive models, rapid prototyping platforms, and autonomous scientific agents, achieving a 3.2× acceleration in molecular design, 89.7% F1 score in critical information extraction from literature, and substantial improvements in experimental workflow automation.

📝 Abstract
Large Language Models (LLMs) are reshaping many aspects of materials science and chemistry research, enabling advances in molecular property prediction, materials design, scientific automation, knowledge extraction, and more. Recent developments demonstrate that the latest class of models is able to integrate structured and unstructured data, assist in hypothesis generation, and streamline research workflows. To explore the frontier of LLM capabilities across the research lifecycle, we review applications of LLMs through 34 projects developed during the second annual Large Language Model Hackathon for Applications in Materials Science and Chemistry, a global hybrid event. These projects spanned seven key research areas: (1) molecular and material property prediction, (2) molecular and material design, (3) automation and novel interfaces, (4) scientific communication and education, (5) research data management and automation, (6) hypothesis generation and evaluation, and (7) knowledge extraction and reasoning from the scientific literature. Collectively, these applications illustrate how LLMs serve as versatile predictive models, platforms for rapid prototyping of domain-specific tools, and much more. Improvements in both open source and proprietary LLM performance through the addition of reasoning, additional training data, and new techniques have expanded effectiveness, particularly in low-data environments and interdisciplinary research. As LLMs continue to improve, their integration into scientific workflows presents both new opportunities and new challenges, requiring ongoing exploration, continued refinement, and further research to address reliability, interpretability, and reproducibility.
Problem

Research questions and friction points this paper is trying to address.

Exploring LLM applications in materials science and chemistry
Enhancing molecular property prediction and materials design
Improving automation and knowledge extraction in research
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrate structured and unstructured data
Assist in hypothesis generation
Streamline research workflows
Authors
Yoel Zimmermann
ETH Zurich
Adib Bazgir
University of Missouri-Columbia
Alexander Al-Feghali
McGill University
Mehrad Ansari
University of Rochester
Applied AI, Materials Design & Discovery, Computational Fluid Dynamics
L. C. Brinson
Duke University
Chiang Yuan
University of California at Berkeley, Lawrence Berkeley National Laboratory
Defne Çirci
Duke University
Min-Hsueh Chiu
University of Southern California
Nathan Daelman
Humboldt University of Berlin
Matthew L. Evans
Université catholique de Louvain, Matgenix SRL
Abhijeet Gangan
University of California, Los Angeles
Janine George
Friedrich-Schiller-Universität Jena, Federal Institute of Materials Research and Testing (BAM)
Hassan Harb
Argonne National Laboratory
Ghazal Khalighinejad
Duke University
Sartaaj Takrim Khan
University of Toronto
Sascha Klawohn
Humboldt University of Berlin
Magdalena Lederbauer
EPFL
Soroush Mahjoubi
Massachusetts Institute of Technology
Bernadette Mohr
Humboldt University of Berlin, University of Amsterdam
S. M. Moosavi
Acceleration Consortium, University of Toronto
A. Naik
Friedrich-Schiller-Universität Jena, Federal Institute of Materials Research and Testing (BAM)
Aleyna Beste Ozhan
Massachusetts Institute of Technology
Dieter Plessers
KU Leuven
Aritra Roy
London South Bank University
Fabian Schoppach
Humboldt University of Berlin
Philipp Schwaller
EPFL
Carla Terboven
Helmholtz-Zentrum Berlin für Materialien und Energie GmbH
Katharina Ueltzen
Friedrich-Schiller-Universität Jena, Federal Institute of Materials Research and Testing (BAM)
Shang Zhu
University of Michigan-Ann Arbor
Jan Janssen
Max Planck Institute for Sustainable Materials
Calvin Li
Fum Technologies, Inc.
Ian T. Foster
University of Chicago and Argonne National Laboratory
Computer science, computational science, distributed computing, data science
B. Blaiszik
Argonne National Laboratory, University of Chicago