Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry

📅 2024-11-20
🏛️ arXiv.org
🤖 AI Summary
Scientific productivity in materials science and chemistry faces persistent bottlenecks, which motivates AI-driven acceleration. Method: We organized the second LLM hackathon for applications in materials science and chemistry as a global hybrid event, with physical hubs and an online hub, engaging 34 teams to investigate large language models (LLMs) across seven application areas, including property prediction, molecular and material design, and research automation. Leveraging open-source base models (e.g., Llama, Phi, Mixtral), teams integrated domain-specific fine-tuning, retrieval-augmented generation (RAG), tool learning, structured output constraints, and scientific knowledge graph enhancement. Contribution/Results: We describe a dual role for LLMs, as multipurpose models for diverse machine learning tasks and as platforms for rapidly prototyping custom applications in scientific research. The hackathon yielded 34 team submissions, each presented with a link to its code and a brief paper in the appendix, and it highlighted significant improvements in LLM capabilities since the previous year's hackathon.

📝 Abstract
Here, we present the outcomes from the second Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry, which engaged participants across global hybrid locations, resulting in 34 team submissions. The submissions spanned seven key application areas and demonstrated the diverse utility of LLMs for applications in (1) molecular and material property prediction; (2) molecular and material design; (3) automation and novel interfaces; (4) scientific communication and education; (5) research data management and automation; (6) hypothesis generation and evaluation; and (7) knowledge extraction and reasoning from scientific literature. Each team submission is presented in a summary table with links to the code and as brief papers in the appendix. Beyond team results, we discuss the hackathon event and its hybrid format, which included physical hubs in Toronto, Montreal, San Francisco, Berlin, Lausanne, and Tokyo, alongside a global online hub to enable local and virtual collaboration. Overall, the event highlighted significant improvements in LLM capabilities since the previous year's hackathon, suggesting continued expansion of LLMs for applications in materials science and chemistry research. These outcomes demonstrate the dual utility of LLMs as both multipurpose models for diverse machine learning tasks and platforms for rapid prototyping custom applications in scientific research.
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Materials Science
Chemistry
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Materials Science and Chemistry
Research Acceleration
Authors
Yoel Zimmermann
Affiliation not specified in the text
Adib Bazgir
Affiliation not specified in the text
Zartashia Afzal
University of the Punjab
Fariha Agbere
University of Maryland, Baltimore County
Qianxiang Ai
Massachusetts Institute of Technology
Nawaf Alampara
PhD Researcher, Friedrich Schiller University Jena
machine learning, ai4science, accelerating research, computational material science
Alexander Al-Feghali
Affiliation not specified in the text
Mehrad Ansari
University of Rochester
Applied AI, Materials Design & Discovery, Computational Fluid Dynamics
Dmytro Antypov
Affiliation not specified in the text
Amro Aswad
Affiliation not specified in the text
Jiaru Bai
Affiliation not specified in the text
Viktoriia Baibakova
Affiliation not specified in the text
Devi Dutta Biswajeet
Affiliation not specified in the text
Erik Bitzek
Affiliation not specified in the text
Joshua D. Bocarsly
Affiliation not specified in the text
Anna Borisova
Affiliation not specified in the text
Andres M. Bran
Affiliation not specified in the text
L. Catherine Brinson
Affiliation not specified in the text
Marcel Moran Calderon
Affiliation not specified in the text
Alessandro Canalicchio
Affiliation not specified in the text
Victor Chen
Affiliation not specified in the text
Yuan Chiang
UC Berkeley, Lawrence Berkeley National Laboratory
geometric deep learning, computational materials science, materials theory, AI for Science
Defne Circi
Affiliation not specified in the text
Benjamin Charmes
Affiliation not specified in the text
Vikrant Chaudhary
Affiliation not specified in the text
Zizhang Chen
Affiliation not specified in the text
Min-Hsueh Chiu
Affiliation not specified in the text
Judith Clymo
University of California, Santa Cruz
Kedar Dabhadkar
Affiliation not specified in the text
Nathan Daelman
Affiliation not specified in the text
Archit Datar
Affiliation not specified in the text
Matthew L. Evans
Affiliation not specified in the text
Maryam Ghazizade Fard
Affiliation not specified in the text
Giuseppe Fisicaro
Affiliation not specified in the text
Abhijeet Sadashiv Gangan
Affiliation not specified in the text
Janine George
Friedrich-Schiller-Universität Jena
Jose D. Cojal Gonzalez
Affiliation not specified in the text
Michael Gotte
Affiliation not specified in the text
Ankur K. Gupta
Affiliation not specified in the text
Hassan Harb
Affiliation not specified in the text
Pengyu Hong
Affiliation not specified in the text
Abdelrahman Ibrahim
Friedrich-Schiller-Universität Jena
Ahmed Ilyas
Affiliation not specified in the text
Alishba Imran
UC Berkeley
Machine Learning, Robotics, Materials Science, Biology
Kevin Ishimwe
University of Maryland, Baltimore County
Ramsey Issa
Friedrich-Schiller-Universität Jena
Kevin Maik Jablonka
FSU Jena & HIPOLE Jena
digital chemistry, AI for science, model evaluations
Colin Jones
Associate Professor of Engineering, École Polytechnique Fédérale de Lausanne (EPFL)
Model Predictive Control, Control, Optimization, Automatic Control, Control Systems
Tyler R. Josephson
Assistant Professor, Chemical, Biochemical, and Environmental Engineering
AI & Theory-Oriented Molecular Science
Gergely Juhasz
Affiliation not specified in the text
Sarthak Kapoor
Affiliation not specified in the text
Rongda Kang
Affiliation not specified in the text
Ghazal Khalighinejad
Affiliation not specified in the text
Sartaaj Takrim Khan
University of Maryland, Baltimore County
Sascha Klawohn
Affiliation not specified in the text
Suneel Kuman
Affiliation not specified in the text
Alvin Noe Ladines
Affiliation not specified in the text
Sarom Leang
Affiliation not specified in the text
Magdalena Lederbauer
Affiliation not specified in the text
Sheng-Lun Mark Liao
Affiliation not specified in the text
Hao Liu
Affiliation not specified in the text
Xuefeng Liu
Affiliation not specified in the text
Stanley Lo
University of Maryland, Baltimore County
Sandeep Madireddy
Mathematics and Computer Science Division, Argonne National Laboratory
Artificial Intelligence, AI for Science, Machine Learning, Foundation Models, Probabilistic AI
Piyush Ranjan Maharana
Affiliation not specified in the text
Shagun Maheshwari
Affiliation not specified in the text
Soroush Mahjoubi
Massachusetts Institute of Technology
José A. Márquez
Friedrich-Schiller-Universität Jena
Rob Mills
Affiliation not specified in the text
Trupti Mohanty
Friedrich-Schiller-Universität Jena
Bernadette Mohr
University of Maryland, Baltimore County
Seyed Mohamad Moosavi
Affiliation not specified in the text
Alexander Moßhammer
Affiliation not specified in the text
Amirhossein D. Naghdi
Affiliation not specified in the text
Aakash Naik
Friedrich-Schiller-Universität Jena
Oleksandr Narykov
Affiliation not specified in the text
Hampus Näström
Affiliation not specified in the text
Xuan Vu Nguyen
Affiliation not specified in the text
Xinyi Ni
Brandeis University
LLM Agent, RAG, Information Extraction, PEFT
Dana O'Connor
Affiliation not specified in the text
Teslim Olayiwola
Affiliation not specified in the text
Federico Ottomano
Postdoctoral researcher in Generative AI at Imperial College London
Deep Learning, AI for Science, Machine Learning Theory
Aleyna Beste Ozhan
Massachusetts Institute of Technology
Sebastian Pagel
Affiliation not specified in the text
Chiku Parida
University of Maryland, Baltimore County
Jaehee Park
Affiliation not specified in the text
Vraj Patel
Affiliation not specified in the text
Elena Patyukova
Friedrich-Schiller-Universität Jena
Martin Hoffmann Petersen
Affiliation not specified in the text
Luis Pinto
Affiliation not specified in the text
José M. Pizarro
Affiliation not specified in the text
Dieter Plessers
Affiliation not specified in the text
Tapashree Pradhan
Affiliation not specified in the text
Utkarsh Pratiush
Affiliation not specified in the text
Charishma Puli
University of Maryland, Baltimore County
Andrew Qin
Affiliation not specified in the text
Mahyar Rajabi
University of Maryland, Baltimore County
Francesco Ricci
Affiliation not specified in the text
Elliot Risch
Affiliation not specified in the text
Martino Ríos-García
Friedrich-Schiller-Universität Jena
Aritra Roy
Affiliation not specified in the text
Tehseen Rug
Affiliation not specified in the text
Hasan M Sayeed
Friedrich-Schiller-Universität Jena
Markus Scheidgen
Affiliation not specified in the text
Mara Schilling-Wilhelmi
Friedrich-Schiller-Universität Jena
Polymer Chemistry, Machine Learning
Marcel Schloz
Affiliation not specified in the text
Fabian Schoppach
Affiliation not specified in the text
Julia Schumann
Affiliation not specified in the text
Philippe Schwaller
Assistant Professor, Laboratory of Artificial Chemical Intelligence - EPFL
Deep Learning, ML for Chemistry, Reaction Prediction, Synthesis Planning, Accelerated Discovery
Marcus Schwarting
Affiliation not specified in the text
Samiha Sharlin
University of Maryland, Baltimore County
Kevin Shen
Affiliation not specified in the text
Jiale Shi
Massachusetts Institute of Technology
Pradip Si
Affiliation not specified in the text
Jennifer D'Souza
TIB Leibniz Information Centre for Science and Technology
Natural Language Processing, Scientific Knowledge Extraction, LLM Evaluation, Scientometrics
Taylor Sparks
Friedrich-Schiller-Universität Jena
Suraj Sudhakar
Affiliation not specified in the text
Leopold Talirz
Affiliation not specified in the text
Dandan Tang
University of Virginia
Structural Equation model, Missing data, Bayesian Statistics, Application of statistics in
Olga Taran
Affiliation not specified in the text
Carla Terboven
Affiliation not specified in the text
Mark Tropin
Affiliation not specified in the text
Anastasiia Tsymbal
Affiliation not specified in the text
Katharina Ueltzen
Affiliation not specified in the text
Pablo Andres Unzueta
Affiliation not specified in the text
Archit Vasan
Researcher at Argonne National Laboratory
Biophysics, Drug Discovery, Machine Learning, Molecular Simulation
Tirtha Vinchurkar
Graduate Student at Carnegie Mellon University
Trung Vo
Affiliation not specified in the text
Gabriel Vogel
PhD candidate, TU Delft
Christoph Volker
Affiliation not specified in the text
Jan Weinreich
Affiliation not specified in the text
Faradawn Yang
Affiliation not specified in the text
Mohd Zaki
Postdoctoral Researcher, Hopkins Extreme Materials Institute, Johns Hopkins University
Civil Engineering, Material Science, Machine Learning
Chi Zhang
Friedrich-Schiller-Universität Jena
Sylvester Zhang
Friedrich-Schiller-Universität Jena
Weijie Zhang
University of Kansas Medical Center
Inverse planning, particle therapy
Ruijie Zhu
University of Science and Technology of China
3d vision
Shang Zhu
Affiliation not specified in the text
Jan Janssen
Max Planck Institute for Sustainable Materials
Ian Foster
Affiliation not specified in the text
Ben Blaiszik
University of Chicago and Argonne National Laboratory
AI and ML for science, laboratory automation, materials data, self-healing materials, energy storage materials