Large Language Model-Driven Database for Thermoelectric Materials

📅 2024-12-31
🤖 AI Summary
Thermoelectric materials research has long been hindered by the absence of comprehensive, reliable, and structured databases. To address this, we present an open-access, scalable thermoelectric materials database covering 7,123 materials, with curated data on chemical composition, crystal structure, Seebeck coefficient, electrical and thermal conductivity, power factor, and ZT values. Methodologically, we use GPTArticleExtractor, an LLM-driven workflow that automates literature parsing and data curation for thermoelectrics by extracting information from the full text of Elsevier publications and mapping it into a structured format. This approach addresses the longstanding bottleneck of manual database construction. The database is intended to support data-driven studies of thermoelectric property prediction and optimization, and to help accelerate the discovery of high-performance thermoelectric materials.

📝 Abstract
Thermoelectric materials provide a sustainable way to convert waste heat into electricity. However, data-driven discovery and optimization of these materials are challenging because a reliable database is lacking. Here we developed a comprehensive database of 7,123 thermoelectric compounds, containing key information such as chemical composition, structural details, Seebeck coefficient, electrical and thermal conductivity, power factor, and figure of merit (ZT). We used the GPTArticleExtractor workflow, powered by large language models (LLMs), to extract and curate data automatically from the scientific literature published in Elsevier journals. This process enabled the creation of a structured database that addresses the challenges of manual data collection. The open-access database could stimulate data-driven research and advance thermoelectric material analysis and discovery.
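The properties listed in the abstract are related by two standard definitions: the power factor PF = S²σ and the figure of merit ZT = S²σT/κ, where S is the Seebeck coefficient, σ the electrical conductivity, κ the thermal conductivity, and T the absolute temperature. The sketch below shows how one entry of such a database might be represented and checked for internal consistency. The record layout, field names, units, and sample values are assumptions made for illustration; they are not the database's actual schema or data.

```python
from dataclasses import dataclass


@dataclass
class ThermoelectricRecord:
    """Hypothetical record layout; field names and units are assumptions."""
    composition: str
    temperature_k: float            # measurement temperature T, in K
    seebeck_v_per_k: float          # Seebeck coefficient S, in V/K
    electrical_cond_s_per_m: float  # electrical conductivity sigma, in S/m
    thermal_cond_w_per_mk: float    # thermal conductivity kappa, in W/(m K)

    def power_factor(self) -> float:
        """Power factor PF = S^2 * sigma, in W/(m K^2)."""
        return self.seebeck_v_per_k ** 2 * self.electrical_cond_s_per_m

    def zt(self) -> float:
        """Dimensionless figure of merit ZT = S^2 * sigma * T / kappa."""
        return self.power_factor() * self.temperature_k / self.thermal_cond_w_per_mk


# Illustrative numbers only; they are not taken from the database.
record = ThermoelectricRecord(
    composition="Bi2Te3",
    temperature_k=300.0,
    seebeck_v_per_k=200e-6,
    electrical_cond_s_per_m=1.0e5,
    thermal_cond_w_per_mk=1.5,
)

print(f"PF = {record.power_factor():.3e} W/(m K^2), ZT = {record.zt():.2f}")
```

With these sample values the power factor is 4 × 10⁻³ W/(m K²) and ZT is about 0.8. Recomputing ZT from the other extracted fields in this way is one possible sanity check on automatically extracted values, since a large mismatch with a reported ZT would often point to a unit or parsing error.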
Problem

Research questions and friction points this paper is trying to address.

Thermoelectric materials
Database deficiency
Research efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Thermoelectric Materials Database
GPTArticleExtractor System
Automated Information Extraction
Suman Itani
PhD Candidate, University of New Hampshire
Condensed Matter Physics, AI, Machine Learning, Deep Learning, Material Informatics
Yibo Zhang
Department of Physics and Astronomy, University of New Hampshire, 9 Library Way, Durham, 03824, NH, USA; Department of Chemistry, University of New Hampshire, 23 Academic Way, Durham, 03824, NH, USA
Jiadong Zang
University of New Hampshire
Condensed Matter Theory