🤖 AI Summary
Problem: Building surrogate models for high-dimensional outputs remains challenging when high-fidelity evaluations are expensive and high-fidelity data are extremely limited (≤10 samples).
Method: This paper proposes two multi-fidelity linear regression surrogate models based on principal component analysis (PCA) projection. First, PCA reduces output dimensionality to construct a low-dimensional principal component basis. Second, two data augmentation strategies are introduced: direct concatenation of high- and low-fidelity data, and explicit linear correction. Third, a fidelity-weighted least-squares training scheme is adopted, with systematic comparison of multiple weighting strategies.
Contribution/Results: Demonstrated on the surface pressure field of a hypersonic vehicle in flight, the proposed methods improve median prediction accuracy by approximately 3–12% over single-fidelity baselines of comparable computational cost in a low-data regime of no more than ten high-fidelity samples.
📝 Abstract
Surrogate modeling for systems with high-dimensional quantities of interest remains challenging, particularly when training data are costly to acquire. This work develops multifidelity methods for multiple-input multiple-output linear regression targeting data-limited applications with high-dimensional outputs. Multifidelity methods integrate many inexpensive low-fidelity model evaluations with limited, costly high-fidelity evaluations. We introduce two projection-based multifidelity linear regression approaches that leverage principal component basis vectors for dimensionality reduction and combine multifidelity data through: (i) a direct data augmentation using low-fidelity data, and (ii) a data augmentation incorporating explicit linear corrections between low-fidelity and high-fidelity data. The data augmentation approaches combine high-fidelity and low-fidelity data into a unified training set and train the linear regression model through weighted least squares with fidelity-specific weights. Various weighting schemes and their impact on regression accuracy are explored. The proposed multifidelity linear regression methods are demonstrated on approximating the surface pressure field of a hypersonic vehicle in flight. In a low-data regime of no more than ten high-fidelity samples, multifidelity linear regression achieves an improvement of approximately 3% to 12% in median accuracy compared to single-fidelity methods with comparable computational cost.
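The pipeline described in the abstract (principal component projection, data augmentation, weighted least squares with fidelity-specific weights) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the default weights, and the choice to compute the basis from the pooled outputs are assumptions made for illustration. The correction variant assumes, further, that low-fidelity outputs are also available at the high-fidelity inputs and that the correction is an affine map fitted in the reduced coordinates.

```python
import numpy as np

def pca_basis(Y, r):
    """Mean and top-r principal directions of snapshots Y (n_samples, n_outputs)."""
    mean = Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y - mean, full_matrices=False)
    return mean, Vt[:r].T  # basis has shape (n_outputs, r)

def weighted_least_squares(X, Z, w):
    """Minimize sum_i w_i * ||z_i - [1, x_i] @ B||^2 over B (intercept included)."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    sw = np.sqrt(w)[:, None]  # scaling rows by sqrt(w_i) turns WLS into ordinary LS
    B, *_ = np.linalg.lstsq(sw * A, sw * Z, rcond=None)
    return B

def fit_mf_linear(X_hf, Y_hf, X_lf, Y_lf, r, w_hf=1.0, w_lf=0.1, Y_lf_at_hf=None):
    """Hypothetical multifidelity fit; weights and basis choice are assumptions."""
    # Assumption: the principal component basis is built from the pooled outputs.
    mean, V = pca_basis(np.vstack([Y_hf, Y_lf]), r)
    Z_hf = (Y_hf - mean) @ V
    Z_lf = (Y_lf - mean) @ V
    if Y_lf_at_hf is not None:
        # Explicit linear correction (assumed form): fit an affine map from
        # low-fidelity to high-fidelity reduced coordinates on paired samples,
        # then apply it to every low-fidelity sample.
        Z_pair = (Y_lf_at_hf - mean) @ V
        C = weighted_least_squares(Z_pair, Z_hf, np.ones(len(Z_hf)))
        Z_lf = np.hstack([np.ones((len(Z_lf), 1)), Z_lf]) @ C
    # Direct augmentation: stack both fidelities into one weighted training set.
    X = np.vstack([X_hf, X_lf])
    Z = np.vstack([Z_hf, Z_lf])
    w = np.concatenate([np.full(len(X_hf), w_hf), np.full(len(X_lf), w_lf)])
    return mean, V, weighted_least_squares(X, Z, w)

def predict(model, X):
    """Predict reduced coordinates, then lift back to the full output space."""
    mean, V, B = model
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    return mean + (A @ B) @ V.T

# Toy usage on synthetic data (not the hypersonic vehicle problem).
rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 200)), rng.normal(size=200)
hf = lambda X: X @ W + b           # stand-in "high-fidelity" model
lf = lambda X: 0.8 * hf(X) + 0.5   # stand-in biased "low-fidelity" model
X_hf, X_lf = rng.uniform(size=(8, 3)), rng.uniform(size=(100, 3))
model = fit_mf_linear(X_hf, hf(X_hf), X_lf, lf(X_lf), r=4, Y_lf_at_hf=lf(X_hf))
X_new = rng.uniform(size=(5, 3))
rel_err = np.linalg.norm(predict(model, X_new) - hf(X_new)) / np.linalg.norm(hf(X_new))
print(f"relative error with linear correction: {rel_err:.2e}")
```

Leaving `Y_lf_at_hf` unset gives the direct augmentation variant, in which uncorrected low-fidelity data enter the training set and `w_lf` controls how much they influence the fit. The value 0.1 is a placeholder, not one of the weighting schemes studied in the paper.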