🤖 AI Summary
This work addresses two problems: high-dimensional surrogate models often degrade severely under distribution shift at deployment, and existing test-time adaptation (TTA) methods are unstable on high-dimensional regression tasks. The authors propose a TTA framework grounded in the D-optimality criterion: it stores the most informative (D-optimal) statistics, which enables both stable online adaptation and principled parameter selection for pretrained surrogate models. On the SIMSHIFT and EngiBench benchmarks, the method improves out-of-distribution performance by up to 7% at negligible computational cost. According to the authors, this is the first systematic demonstration of effective TTA for high-dimensional simulation regression and generative design optimization.
📝 Abstract
Machine learning surrogates are increasingly used in engineering to accelerate costly simulations, yet distribution shifts between training and deployment (e.g., unseen geometries or configurations) often cause severe performance degradation. Test-Time Adaptation (TTA) can mitigate such shifts, but existing methods were largely developed for lower-dimensional classification with structured outputs and visually aligned input-output relationships, which makes them unstable on the high-dimensional, unstructured regression problems common in simulation. We address this challenge by proposing a TTA framework based on storing maximally informative (D-optimal) statistics, which jointly enables stable adaptation and principled parameter selection at test time. When applied to pretrained simulation surrogates, our method yields out-of-distribution improvements of up to 7% at negligible computational cost. To the best of our knowledge, this is the first systematic demonstration of effective TTA for high-dimensional simulation regression and generative design optimization, validated on the SIMSHIFT and EngiBench benchmarks.
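For readers unfamiliar with the D-optimality criterion mentioned above, the sketch below illustrates the general idea in its textbook form: from a pool of candidate feature vectors, choose a subset that maximizes the log-determinant of the information matrix. This is not the authors' algorithm. The function name `greedy_d_optimal`, the `ridge` regularizer, and the greedy selection strategy are all assumptions made for illustration, and the abstract does not specify which statistics the method stores or how it selects them.

```python
import numpy as np

def greedy_d_optimal(features, k, ridge=1e-6):
    """Greedily pick k rows maximizing log det(X_S^T X_S + ridge * I).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    features = np.asarray(features, dtype=float)
    n, d = features.shape
    info = ridge * np.eye(d)              # regularized information matrix
    available = np.ones(n, dtype=bool)    # rows not yet selected
    selected = []
    for _ in range(k):
        inv = np.linalg.inv(info)
        # Matrix determinant lemma: det(M + x x^T) = det(M) * (1 + x^T M^{-1} x),
        # so the row with the largest x^T M^{-1} x gives the largest log-det gain.
        gains = np.einsum("ij,jk,ik->i", features, inv, features)
        gains[~available] = -np.inf       # never pick the same row twice
        best = int(np.argmax(gains))
        selected.append(best)
        available[best] = False
        info += np.outer(features[best], features[best])
    # slogdet returns (sign, log|det|); the matrix is positive definite here.
    return selected, np.linalg.slogdet(info)[1]
```

The greedy rule prefers rows that point in directions the current selection covers poorly, so near-duplicate samples contribute little. A greedy pass is not guaranteed to find the globally D-optimal subset; it is used here only because it is short and easy to follow.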