🤖 AI Summary
Materials discovery is difficult because chemical and structural spaces are vast and objectives frequently conflict, for example target properties versus synthetic feasibility. To address this, we propose LLEMA, the first framework that tightly integrates the scientific reasoning capabilities of large language models (LLMs) with chemistry-aware evolutionary rules. Within an iterative search loop, LLEMA combines LLM-guided, crystal-structure-constrained candidate generation, memory-based feedback, and surrogate-augmented multi-objective evaluation, so that synthetic accessibility and multi-objective trade-offs are handled together. Evaluated on 14 real-world materials design tasks, LLEMA outperforms pure LLMs and state-of-the-art generative models, with improved Pareto-front quality and up to a 3.2× higher hit rate. The results point toward a rational, data-efficient approach to materials design.
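The two evaluation notions mentioned above, Pareto-front quality and hit rate, can be illustrated with a small sketch. It is not taken from the LLEMA code base. It assumes every objective is to be maximized and that a "hit" is a candidate passing a user-supplied predicate. The function names and the numeric points are hypothetical.

```python
from typing import Callable, Sequence

Point = tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every objective and strictly better on one.

    Assumption: all objectives are maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Sequence[Point]) -> list[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def hit_rate(points: Sequence[Point], is_hit: Callable[[Point], bool]) -> float:
    """Fraction of candidates that satisfy the (hypothetical) hit predicate."""
    if not points:
        return 0.0
    return sum(1 for p in points if is_hit(p)) / len(points)


# Made-up objective values for five candidates (two objectives each).
candidates = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
front = pareto_front(candidates)
rate = hit_rate(candidates, lambda p: all(v >= 1.0 for v in p))
```

Here `front` contains the three non-dominated points and `rate` is the share of candidates whose objectives all reach the threshold of 1.0. A real task would replace the predicate with its own property constraints.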
📝 Abstract
Materials discovery requires navigating vast chemical and structural spaces while satisfying multiple, often conflicting, objectives. We present LLM-guided Evolution for MAterials design (LLEMA), a unified framework that couples the scientific knowledge embedded in large language models with chemistry-informed evolutionary rules and memory-based refinement. At each iteration, an LLM proposes crystallographically specified candidates under explicit property constraints; a surrogate-augmented oracle estimates physicochemical properties; and a multi-objective scorer updates success/failure memories to guide subsequent generations. Evaluated on 14 realistic tasks spanning electronics, energy, coatings, optics, and aerospace, LLEMA discovers candidates that are chemically plausible, thermodynamically stable, and property-aligned, achieving higher hit rates and stronger Pareto fronts than generative and LLM-only baselines. Ablation studies confirm the importance of rule-guided generation, memory-based refinement, and surrogate prediction. By enforcing synthesizability and multi-objective trade-offs, LLEMA delivers a principled pathway to accelerate practical materials discovery.
Code: https://github.com/scientific-discovery/LLEMA
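The iteration described in the abstract (propose, estimate properties, score, update memories) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation in the linked repository. The proposer and oracle are toy stand-ins for the LLM and the surrogate-augmented oracle, the score is simply the fraction of constraints met, and all names, formulas (`AB`, `A2B`, ...) and property values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

Constraints = dict[str, tuple[float, float]]  # property name -> (low, high)


@dataclass
class Candidate:
    formula: str
    properties: dict[str, float]
    score: float


@dataclass
class Memory:
    successes: list[Candidate] = field(default_factory=list)
    failures: list[Candidate] = field(default_factory=list)


def score(props: dict[str, float], constraints: Constraints) -> float:
    """Fraction of property constraints satisfied (an assumed, simplified scorer)."""
    met = sum(
        1
        for name, (low, high) in constraints.items()
        if name in props and low <= props[name] <= high
    )
    return met / len(constraints)


def run_search(
    propose: Callable[[Memory, Constraints, int], list[str]],
    oracle: Callable[[str], dict[str, float]],
    constraints: Constraints,
    iterations: int,
    batch_size: int,
) -> Memory:
    """Propose, evaluate, score, and file each candidate into a memory."""
    memory = Memory()
    for _ in range(iterations):
        for formula in propose(memory, constraints, batch_size):
            props = oracle(formula)
            cand = Candidate(formula, props, score(props, constraints))
            target = memory.successes if cand.score == 1.0 else memory.failures
            target.append(cand)
    return memory


# Toy stand-ins: placeholder formulas with made-up property values.
TOY_TABLE = {
    "AB": {"band_gap": 1.2, "formation_energy": -0.8},
    "A2B": {"band_gap": 3.5, "formation_energy": -0.2},
    "AB2": {"band_gap": 1.8, "formation_energy": 0.4},
    "A3B": {"band_gap": 2.0, "formation_energy": -1.1},
}


def toy_propose(memory: Memory, constraints: Constraints, n: int) -> list[str]:
    """Stand-in for the LLM: return up to n formulas not yet tried."""
    seen = {c.formula for c in memory.successes + memory.failures}
    return [f for f in TOY_TABLE if f not in seen][:n]


def toy_oracle(formula: str) -> dict[str, float]:
    """Stand-in for the surrogate-augmented oracle: a table lookup."""
    return TOY_TABLE[formula]


constraints = {"band_gap": (1.0, 2.5), "formation_energy": (float("-inf"), 0.0)}
memory = run_search(toy_propose, toy_oracle, constraints, iterations=3, batch_size=2)
```

In this sketch the memory only prevents repeated proposals. In the framework as described, the success and failure memories are fed back to guide what the LLM generates next, which a real proposer would do by including them in its prompt.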