An information-matching approach to optimal experimental design and active learning

📅 2024-11-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high cost and redundancy of data acquisition when predicting quantities of interest (QoIs) with models that have many parameters, this paper proposes an information-matching criterion for optimal experimental design and active learning. The criterion uses the Fisher information matrix to select, from a candidate pool, the training data that carry enough information to constrain only the parameter combinations the QoIs depend on, which avoids identifying every parameter. Because the selection is formulated as a convex optimization problem, it scales to large models and datasets. Applications in power systems, underwater acoustics, and an active learning loop for materials science show that a relatively small set of optimally selected training data provides the information needed for precise QoI predictions, which reduces data collection costs.

📝 Abstract
The efficacy of mathematical models heavily depends on the quality of the training data, yet collecting sufficient data is often expensive and challenging. Many modeling applications require inferring parameters only as a means to predict other quantities of interest (QoIs). Because models often contain many unidentifiable (sloppy) parameters, QoIs often depend on a relatively small number of parameter combinations. Therefore, we introduce an information-matching criterion based on the Fisher Information Matrix to select the most informative training data from a candidate pool. This method ensures that the selected data contain sufficient information to learn only those parameters that are needed to constrain downstream QoIs. It is formulated as a convex optimization problem, making it scalable to large models and datasets. We demonstrate the effectiveness of this approach across various modeling problems in diverse scientific fields, including power systems and underwater acoustics. Finally, we use information-matching as a query function within an Active Learning loop for materials science applications. In all these applications, we find that a relatively small set of optimal training data can provide the necessary information for achieving precise predictions. These results are encouraging for diverse future applications, particularly active learning in large machine learning models.
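The matching idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes one plausible reading of "sufficient information": the weighted sum of candidate Fisher information matrices, minus the Fisher information matrix implied by the target QoI precision, must be positive semidefinite. The two-parameter toy model, the Gaussian-noise form of the Fisher information, the helper names `fim` and `matches`, and all weight values are hypothetical and chosen only for illustration.

```python
import numpy as np


def fim(jacobian, sigma=1.0):
    """Fisher information J^T J / sigma^2, assuming Gaussian noise with std sigma."""
    jacobian = np.atleast_2d(np.asarray(jacobian, dtype=float))
    return jacobian.T @ jacobian / sigma**2


def matches(weights, candidate_fims, target_fim, tol=1e-9):
    """Return True if sum_m w_m * I_m - J is positive semidefinite (up to tol).

    This is the assumed matching condition, not a formula quoted from the paper.
    """
    total = sum(w * info for w, info in zip(weights, candidate_fims))
    gap = total - target_fim
    return bool(np.linalg.eigvalsh(gap).min() >= -tol)


# Hypothetical toy model with parameters (a, b).
# The QoI depends only on the combination a + b, wanted to within std 0.5.
target = fim([[1.0, 1.0]], sigma=0.5)

# Candidate pool: each datum has its own sensitivity (Jacobian row).
candidates = [
    fim([[1.0, 0.0]]),  # informs a only
    fim([[0.0, 1.0]]),  # informs b only
    fim([[1.0, 1.0]]),  # informs a + b directly
]

w_single = np.array([0.0, 0.0, 4.0])  # weight only the datum aligned with the QoI
w_pair = np.array([8.0, 8.0, 0.0])    # identify a and b separately instead
w_short = np.array([4.0, 4.0, 0.0])   # same pair, but too little weight

for name, w in [("single", w_single), ("pair", w_pair), ("short", w_short)]:
    print(name, "total weight:", w.sum(), "matches:", matches(w, candidates, target))
```

In this toy case the single datum aligned with the QoI satisfies the condition with a total weight of 4, while identifying both parameters separately needs a total weight of 16. A convex solver that minimizes total weight under this constraint would therefore prefer the single datum, which is the sense in which a small set of well-chosen data can suffice.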
Problem

Research questions and friction points this paper is trying to address.

Parameter Optimization
Mathematical Modeling
Data Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Information Matching
Fisher Information Matrix
Active Learning in Large-scale Models

Yonatan Kurniawan
Brigham Young University, Provo, UT 84602, USA
T. Neilsen
Brigham Young University, Provo, UT 84602, USA
Benjamin L. Francis
Achilles Heel Technologies, Orem, UT 84097, USA
Aleksandar M. Stankovic
SLAC National Accelerator Laboratory, Menlo Park, CA, USA
Mingjian Wen
University of Houston, Houston, TX 77204, USA
Ilia A. Nikiforov
University of Minnesota, Minneapolis, MN 55455, USA
E. Tadmor
University of Minnesota, Minneapolis, MN 55455, USA
Vasily V. Bulatov
Lawrence Livermore National Laboratory
Vincenzo Lordi
Lawrence Livermore National Laboratory
Computational Materials Science, Semiconductors, Spectroscopy, Renewable Energy, Quantum Information Science
M. Transtrum
Brigham Young University, Provo, UT 84602, USA; Achilles Heel Technologies, Orem, UT 84097, USA; SLAC National Accelerator Laboratory, Menlo Park, CA, USA