Projection-based multifidelity linear regression for data-scarce applications

📅 2025-08-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Building surrogate models for high-dimensional outputs is difficult when high-fidelity evaluations are expensive and at most ten high-fidelity samples are available. Method: This paper proposes two multifidelity linear regression surrogate models based on principal component analysis (PCA) projection. First, PCA projects the high-dimensional outputs onto a low-dimensional principal component basis. Second, high- and low-fidelity data are merged into a single training set by one of two data augmentation strategies: direct use of the low-fidelity data, or an explicit linear correction between low- and high-fidelity data. Third, the regression is trained by weighted least squares with fidelity-specific weights, and several weighting schemes are compared. Contribution/Results: On surface-pressure prediction for a hypersonic vehicle with no more than ten high-fidelity samples, the proposed methods improve median accuracy by approximately 3–12% over single-fidelity methods at comparable computational cost.

📝 Abstract

Surrogate modeling for systems with high-dimensional quantities of interest remains challenging, particularly when training data are costly to acquire. This work develops multifidelity methods for multiple-input multiple-output linear regression targeting data-limited applications with high-dimensional outputs. Multifidelity methods integrate many inexpensive low-fidelity model evaluations with limited, costly high-fidelity evaluations. We introduce two projection-based multifidelity linear regression approaches that leverage principal component basis vectors for dimensionality reduction and combine multifidelity data through: (i) a direct data augmentation using low-fidelity data, and (ii) a data augmentation incorporating explicit linear corrections between low-fidelity and high-fidelity data. The data augmentation approaches combine high-fidelity and low-fidelity data into a unified training set and train the linear regression model through weighted least squares with fidelity-specific weights. Various weighting schemes and their impact on regression accuracy are explored. The proposed multifidelity linear regression methods are demonstrated on approximating the surface pressure field of a hypersonic vehicle in flight. In a low-data regime of no more than ten high-fidelity samples, multifidelity linear regression achieves approximately 3% to 12% improvement in median accuracy compared to single-fidelity methods with comparable computational cost.
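
The weighted least-squares step in the abstract can be written compactly. The notation below is assumed for illustration and is not taken from the paper: $\tilde{x}$ is an input vector with a leading 1 for the intercept, $c = V_r^\top (y - \bar{y})$ holds the coefficients of an output $y$ in the first $r$ principal component basis vectors $V_r$, and $w_{\mathrm{HF}}$, $w_{\mathrm{LF}}$ are the fidelity-specific weights. A plausible form of the training objective is then:

```latex
\hat{B} = \arg\min_{B}\;
  w_{\mathrm{HF}} \sum_{i=1}^{N_{\mathrm{HF}}}
    \left\lVert c_i^{\mathrm{HF}} - B^\top \tilde{x}_i^{\mathrm{HF}} \right\rVert_2^2
  \;+\;
  w_{\mathrm{LF}} \sum_{j=1}^{N_{\mathrm{LF}}}
    \left\lVert c_j^{\mathrm{LF}} - B^\top \tilde{x}_j^{\mathrm{LF}} \right\rVert_2^2,
\qquad
\hat{y}(x) = \bar{y} + V_r \hat{B}^\top \tilde{x}.
```

Setting $w_{\mathrm{LF}} = 0$ recovers an ordinary single-fidelity fit on the high-fidelity samples alone. In approach (ii), the low-fidelity terms would presumably use linearly corrected low-fidelity outputs in place of the raw ones.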

Problem

Research questions and friction points this paper is trying to address.

Surrogate modeling for high-dimensional outputs with scarce data

Multifidelity linear regression combining low- and high-fidelity data

Improving accuracy in data-limited applications with dimensionality reduction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Projection-based multifidelity linear regression methods

Principal component basis for dimensionality reduction

Weighted least squares with fidelity-specific weights
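
The three ideas above can be sketched together in a few lines of NumPy. This is a minimal illustration of the direct data augmentation variant only, not the authors' implementation. Several choices here are assumptions: the function names, the default weights `w_hi` and `w_lo`, computing the PCA basis from the pooled high- and low-fidelity outputs, and the synthetic data.

```python
import numpy as np

def pca_basis(Y, r):
    """Return the output mean and the first r principal directions of Y (rows = samples)."""
    mean = Y.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Y - mean, full_matrices=False)
    return mean, Vt[:r].T  # basis has shape (m, r)

def fit_weighted_linear(X, C, w):
    """Solve min_B sum_i w_i * ||C_i - [1, x_i] B||^2 by scaling rows with sqrt(w_i)."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    s = np.sqrt(w)[:, None]
    B, *_ = np.linalg.lstsq(s * A, s * C, rcond=None)
    return B  # shape (d + 1, r)

def fit_multifidelity(X_hi, Y_hi, X_lo, Y_lo, r, w_hi=1.0, w_lo=0.1):
    """Direct augmentation: pool both fidelities, project outputs, fit with per-fidelity weights."""
    X = np.vstack([X_hi, X_lo])
    Y = np.vstack([Y_hi, Y_lo])
    mean, V = pca_basis(Y, r)   # assumption: basis built from the pooled outputs
    C = (Y - mean) @ V          # reduced coordinates, shape (n, r)
    w = np.concatenate([np.full(len(X_hi), w_hi), np.full(len(X_lo), w_lo)])
    return mean, V, fit_weighted_linear(X, C, w)

def predict(model, X):
    """Map inputs to reduced coordinates, then lift back to the full output space."""
    mean, V, B = model
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    return mean + (A @ B) @ V.T

# Synthetic demo (hypothetical data): 5 high-fidelity and 40 low-fidelity samples.
rng = np.random.default_rng(0)
W_true = rng.normal(size=(4, 50))                       # 3 inputs + intercept, 50 outputs
truth = lambda X: np.hstack([np.ones((len(X), 1)), X]) @ W_true
X_hi, X_lo = rng.normal(size=(5, 3)), rng.normal(size=(40, 3))
Y_hi = truth(X_hi)
Y_lo = 0.9 * truth(X_lo) + 0.05 * rng.normal(size=(40, 50))  # biased, noisy low fidelity
model = fit_multifidelity(X_hi, Y_hi, X_lo, Y_lo, r=3)
Y_pred = predict(model, rng.normal(size=(10, 3)))       # shape (10, 50)
```

Scaling each row by the square root of its weight turns the weighted problem into an ordinary least-squares solve, so the cost stays close to that of a single-fidelity fit on the pooled data. The weight ratio `w_lo / w_hi` controls how much the abundant low-fidelity samples influence the fit.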

🔎 Similar Papers

Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits