Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets

📅 2024-06-08
🏛️ Annual Meeting of the Association for Computational Linguistics
📈 Citations: 2
Influential: 1
🤖 AI Summary
This study evaluates GPT-4's ability to extract structured information on demand from the materials science literature, specifically assessing its zero-shot fidelity in reproducing two manually curated materials datasets. Methodologically, the authors use a domain-expert-driven error attribution framework, grounded in careful human annotation, to diagnose where model outputs deviate along dimensions including numerical accuracy, contextual disambiguation, and unit standardization. The results reveal significant fidelity gaps in GPT-4's scientific information extraction (IE), particularly on precision-critical tasks. The key contribution is a deep, expert-led error analysis for scientific IE evaluation, enabling quantitative characterization of large language models' reliability limits in authentic research settings. The authors further suggest research directions toward scalable, expert-validated evaluation of scientific IE, strengthening methodological foundations for high-fidelity AI-assisted scientific discovery.

📝 Abstract
We explore the ability of GPT-4 to perform ad-hoc schema-based information extraction from scientific literature. We assess specifically whether it can, with a basic prompting approach, replicate two existing materials science datasets, given the manuscripts from which they were originally manually extracted. We employ materials scientists to perform a detailed manual error analysis to assess where the model struggles to faithfully extract the desired information, and draw on their insights to suggest research directions to address this broadly important task.
Problem

Research questions and friction points this paper is trying to address.

Assessing GPT-4's ability for schema-based scientific information extraction
Evaluating replication of materials science datasets via basic prompting
Identifying model limitations through manual error analysis by experts
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPT-4 for ad-hoc schema extraction
Basic prompting replicates materials datasets
Manual error analysis guides improvements
Satanu Ghosh
University of New Hampshire
Neal R. Brodnik
University of California, Santa Barbara
Carolina Frey
University of California, Santa Barbara
Collin Holgate
University of California, Santa Barbara
Tresa M. Pollock
University of California, Santa Barbara
Samantha Daly
University of California, Santa Barbara
Samuel Carton
Assistant Professor, University of New Hampshire
Computer science