Foundational Large Language Models for Materials Research

📅 2024-12-12
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the information overload and inefficient knowledge extraction that slow progress in materials science, this study introduces LLaMat, a family of domain-specific foundational large language models for materials research. LLaMat is developed via continued pretraining of LLaMA models on large-scale materials literature and crystallographic CIF data, adapting them for text understanding, structured information extraction, and crystal structure generation. The study first highlights the critical impact of base model selection: despite LLaMA-3's stronger general-purpose performance, the LLaMA-2-based LLaMat achieves better domain-specific results after adaptation, suggesting an adaptation rigidity in overtrained base models. A specialized variant, LLaMat-CIF, achieves both high stability and broad coverage in periodic-table-scale crystal generation. Experiments show that LLaMat significantly surpasses general-purpose LLMs on materials NLP and information-extraction tasks, while LLaMat-CIF generates thousands of thermodynamically stable novel crystals covering over 90% of chemical elements, with a 42% improvement in structure-prediction accuracy.
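As a rough illustration (not the authors' code), the CIF data LLaMat-CIF trains on is plain text, which is what lets a language model both ingest and generate crystal structures as token sequences. A minimal sketch of serializing a structure into CIF-style text; the `Structure` class and `to_cif_text` helper are hypothetical names invented for this example:

```python
# Hedged sketch: rendering a crystal structure as CIF-style text, the
# plain-text representation a language model can consume or emit.
# All names here (Structure, to_cif_text) are illustrative, not from the paper.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Structure:
    formula: str
    a: float; b: float; c: float             # lattice lengths (Angstrom)
    alpha: float; beta: float; gamma: float  # lattice angles (degrees)
    # each site: (element symbol, fractional x, y, z)
    sites: List[Tuple[str, float, float, float]] = field(default_factory=list)

def to_cif_text(s: Structure) -> str:
    """Serialize a structure using standard CIF core-dictionary tags."""
    lines = [
        f"data_{s.formula}",
        f"_cell_length_a    {s.a:.4f}",
        f"_cell_length_b    {s.b:.4f}",
        f"_cell_length_c    {s.c:.4f}",
        f"_cell_angle_alpha {s.alpha:.2f}",
        f"_cell_angle_beta  {s.beta:.2f}",
        f"_cell_angle_gamma {s.gamma:.2f}",
        "loop_",
        "_atom_site_type_symbol",
        "_atom_site_fract_x",
        "_atom_site_fract_y",
        "_atom_site_fract_z",
    ]
    for el, x, y, z in s.sites:
        lines.append(f"{el} {x:.4f} {y:.4f} {z:.4f}")
    return "\n".join(lines)

# Rock-salt NaCl as a toy example.
nacl = Structure("NaCl", 5.64, 5.64, 5.64, 90, 90, 90,
                 sites=[("Na", 0.0, 0.0, 0.0), ("Cl", 0.5, 0.5, 0.5)])
cif = to_cif_text(nacl)
print(cif.splitlines()[0])  # → data_NaCl
```

Because the whole structure is one string, continued pretraining on corpora of such files needs no architecture changes; generation is ordinary autoregressive decoding of CIF text.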

📝 Abstract
Materials discovery and development are critical for addressing global challenges. Yet, the exponential growth in materials science literature comprising vast amounts of textual data has created significant bottlenecks in knowledge extraction, synthesis, and scientific reasoning. Large Language Models (LLMs) offer unprecedented opportunities to accelerate materials research through automated analysis and prediction. Still, their effective deployment requires domain-specific adaptation for understanding and solving domain-relevant tasks. Here, we present LLaMat, a family of foundational models for materials science developed through continued pretraining of LLaMA models on an extensive corpus of materials literature and crystallographic data. Through systematic evaluation, we demonstrate that LLaMat excels in materials-specific NLP and structured information extraction while maintaining general linguistic capabilities. The specialized LLaMat-CIF variant demonstrates unprecedented capabilities in crystal structure generation, predicting stable crystals with high coverage across the periodic table. Intriguingly, despite LLaMA-3's superior general performance compared to LLaMA-2, we observe that LLaMat-2 demonstrates unexpectedly enhanced domain-specific performance across diverse materials science tasks, including structured information extraction from text and tables and, most notably, crystal structure generation, suggesting a potential adaptation rigidity in overtrained LLMs. Altogether, the present work demonstrates the effectiveness of domain adaptation towards developing practically deployable LLM copilots for materials research. Beyond materials science, our findings reveal important considerations for domain adaptation of LLMs, such as model selection, training methodology, and domain-specific performance, which may influence the development of specialized scientific AI systems.
Problem

Research questions and friction points this paper is trying to address.

Materials Science
Literature Analysis
Knowledge Discovery

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLaMat
Materials Science
Crystal Structure Generation
Vaibhav Mishra
Department of Computer Science and Engineering, Indian Institute of Technology Delhi
Somaditya Singh
Department of Computer Science and Engineering, Indian Institute of Technology Delhi
D. Ahlawat
Department of Computer Science and Engineering, Indian Institute of Technology Delhi
Mohd Zaki
Postdoctoral Researcher, Hopkins Extreme Materials Institute, Johns Hopkins University
Civil Engineering, Material Science, Machine Learning
Vaibhav Bihani
Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi
Hargun Singh Grover
Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi
Biswajit Mishra
Cerebras Systems, Inc.
Santiago Miret
Lila Sciences
Mausam
Professor of Computer Science & Engineering, Indian Institute of Technology Delhi
Artificial Intelligence, Neuro-Symbolic AI, Information Extraction, Knowledge Graph, AI for Materials
N. M. A. Krishnan
Department of Civil Engineering, Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi