🤖 AI Summary
Complex systems—such as autonomous materials design and smart factory configuration—exhibit continuous design spaces punctuated by discontinuities across regions, posing significant challenges for surrogate modeling and optimization.
Method: This paper proposes a piecewise Gaussian process (Jump GP) surrogate model and a bias-aware active learning framework. We introduce the Jump GP model for the first time and develop a joint bias–variance estimation mechanism that explicitly incorporates model bias into the acquisition function. Coupled with adaptive piecewise spatial partitioning, an enhanced expected improvement criterion, and uncertainty-weighted sampling, the framework enables robust, discontinuity-aware modeling and efficient data-driven exploration.
Contribution/Results: Evaluated on multiple benchmark and real-world industrial simulation tasks, the method achieves significantly lower prediction error and reduced sample complexity compared to standard GP-based active learning. Optimization efficiency improves by over 35%, marking a dual advance in theoretical guarantees and practical performance for discontinuous surrogate modeling.
📝 Abstract
Active learning of Gaussian process (GP) surrogates has been useful for optimizing experimental designs for physical/computer simulation experiments, and for steering data acquisition schemes in machine learning. In this paper, we develop a method for active learning of piecewise, Jump GP surrogates. Jump GPs are continuous within, but discontinuous across, regions of a design space, as required for applications spanning autonomous materials design, configuration of smart factory systems, and many others. Although our active learning heuristics are appropriated from strategies originally designed for ordinary GPs, we demonstrate that additionally accounting for model bias, as opposed to the usual model uncertainty, is essential in the Jump GP context. Toward that end, we develop an estimator for bias and variance of Jump GP models. Illustrations, and evidence of the advantage of our proposed methods, are provided on a suite of synthetic benchmarks, and real-simulation experiments of varying complexity.