Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials

📅 2026-03-12
🤖 AI Summary
This work addresses the limited reliability of machine-learned interatomic potentials (MLIPs) in high-throughput materials screening: used alone as a stability filter, a single MLIP misses 93% of density functional theory (DFT)-stable candidates (recall 0.07) on a 25,000-material benchmark. The authors propose Proof-Carrying Materials (PCM), a three-stage falsifiable safety certification framework that combines adversarial falsification to expose blind spots, bootstrap envelope refinement with 95% confidence intervals, and formal certification in Lean 4. In a thermoelectric screening case study, PCM-audited protocols recover 62 additional stable materials missed by single-MLIP screening, a 25% improvement in discovery yield. A risk model trained on PCM-discovered features predicts failures on unseen materials with an AUC-ROC of 0.938 and transfers across architectures with a cross-MLIP AUC-ROC of about 0.70. Auditing CHGNet, TensorNet and MACE shows that their failure modes are architecture-specific.
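For readers unfamiliar with the AUC-ROC figures quoted above, the following is a minimal sketch of how the metric can be computed from failure labels and risk scores. It is an illustration only, not the authors' evaluation code: the function name `auc_roc` and the toy labels and scores are hypothetical, and the pairwise formulation is chosen for clarity rather than speed.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen failure (label 1)
    receives a higher risk score than a randomly chosen non-failure (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = the MLIP prediction failed, 0 = it did not.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1]
auc = auc_roc(labels, scores)
print(f"AUC-ROC = {auc:.3f}")
```

A value of 0.5 corresponds to a risk score that ranks failures no better than chance, and 1.0 to a perfect ranking, which gives some context for the reported 0.938 (within-model) and roughly 0.70 (cross-MLIP) values.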

📝 Abstract
Machine-learned interatomic potentials (MLIPs) are deployed for high-throughput materials screening without formal reliability guarantees. We show that a single MLIP used as a stability filter misses 93% of density functional theory (DFT)-stable materials (recall 0.07) on a 25,000-material benchmark. Proof-Carrying Materials (PCM) closes this gap through three stages: adversarial falsification across compositional space, bootstrap envelope refinement with 95% confidence intervals, and Lean 4 formal certification. Auditing CHGNet, TensorNet and MACE reveals architecture-specific blind spots with near-zero pairwise error correlations (r <= 0.13; n = 5,000), confirmed by independent Quantum ESPRESSO validation (20/20 converged; median DFT/CHGNet force ratio 12x). A risk model trained on PCM-discovered features predicts failures on unseen materials (AUC-ROC = 0.938 +/- 0.004) and transfers across architectures (cross-MLIP AUC-ROC ~ 0.70; feature importance r = 0.877). In a thermoelectric screening case study, PCM-audited protocols discover 62 additional stable materials missed by single-MLIP screening, a 25% improvement in discovery yield.
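To make the recall figure and the idea of a 95% bootstrap interval concrete, here is a minimal sketch under stated assumptions. It is not the PCM envelope-refinement procedure, whose details are not given in the abstract: it applies a plain percentile bootstrap to recall, and the function names (`recall`, `bootstrap_recall_ci`) and the toy labels are hypothetical. The toy data are constructed so that the filter keeps 7 of 100 stable materials, mirroring the reported recall of 0.07.

```python
import random


def recall(y_true, y_pred):
    """Fraction of truly stable materials (label 1) that the filter keeps."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no stable materials in the sample")
    return tp / (tp + fn)


def bootstrap_recall_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for recall (a generic technique,
    assumed here for illustration)."""
    rng = random.Random(seed)
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    stats = []
    for _ in range(n_boot):
        # Resample materials with replacement, keeping label/prediction pairs.
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        t, p = zip(*sample)
        if not any(t):
            continue  # skip resamples that contain no stable material
        stats.append(recall(t, p))
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi


# Hypothetical toy data: 100 DFT-stable and 100 unstable materials.
# The filter labels only 7 of the stable ones as stable.
y_true = [1] * 100 + [0] * 100
y_pred = [1] * 7 + [0] * 93 + [0] * 100

point = recall(y_true, y_pred)
lo, hi = bootstrap_recall_ci(y_true, y_pred)
print(f"recall = {point:.2f}, 95% bootstrap interval = [{lo:.3f}, {hi:.3f}]")
```

A recall of 0.07 means 93 of every 100 DFT-stable materials are discarded by the filter, which is the gap the abstract describes. The interval width depends on the number of stable materials in the sample, so the toy interval says nothing about the uncertainty on the paper's 25,000-material benchmark.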
Problem

Research questions and friction points this paper is trying to address.

machine-learned interatomic potentials
reliability guarantees
materials screening
stability prediction
falsifiable safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proof-Carrying Materials
machine-learned interatomic potentials
adversarial falsification
formal certification
uncertainty quantification
Abhinaba Basu
Indian Institute of Information Technology Allahabad (IIITA), Prayagraj, India
Pavan Chakraborty
Indian Institute of Information Technology Allahabad
Artificial Intelligence, Robotics & Instrumentation