AI Summary
This work addresses the challenge of effectively leveraging pre-trained black-box models for personalized nonparametric regression in data-scarce settings. The authors develop a theoretical framework and propose a nonparametric regression algorithm that integrates information from black-box models while remaining robust even when these pre-trained models fail. In limited-sample regimes, the proposed method achieves the minimax-optimal convergence rate established within the framework. The theoretical analysis draws on nonparametric regression theory, minimax theory, and statistical learning theory. Empirical evaluations on both synthetic data and the California housing dataset demonstrate the algorithm's superior performance in low-sample scenarios, confirming its practical efficacy and theoretical guarantees.
Abstract
Recent advances in large-scale models, including deep neural networks and large language models, have substantially improved performance across a wide range of learning tasks. The widespread availability of such pre-trained models creates new opportunities for data-efficient statistical learning, provided they can be effectively integrated into downstream tasks. Motivated by this setting, we study few-shot personalization, where a pre-trained black-box model is adapted to a target domain using a limited number of samples. We develop a theoretical framework for few-shot personalization in nonparametric regression and propose algorithms that can incorporate a black-box pre-trained model into the regression procedure. We establish the minimax optimal rate for the personalization problem and show that the proposed method attains this rate. Our results clarify the statistical benefits of leveraging pre-trained models under sample scarcity and provide robustness guarantees when the pre-trained model is not informative. We illustrate the finite-sample performance of the methods through simulations and an application to the California housing dataset with several pre-trained models.
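The abstract describes incorporating a black-box pre-trained model into a nonparametric regression procedure while retaining robustness when the model is uninformative. The paper's actual algorithm and rate analysis are not reproduced here; the following is only a minimal sketch of the general idea, under assumed design choices: smooth the residuals of an arbitrary black-box predictor with a Nadaraya-Watson estimator, and fall back to plain kernel smoothing when held-out error indicates the black box does not help. All names (`kernel_smooth`, `personalize`), the Gaussian kernel, and the bandwidth are illustrative assumptions, not the authors' method.

```python
import numpy as np


def kernel_smooth(x_train, y_train, x_query, h):
    """Nadaraya-Watson estimator with a Gaussian kernel (1-D inputs)."""
    # d[i, j] = (x_query[i] - x_train[j]) / h
    d = (x_query[:, None] - x_train[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    return (w @ y_train) / w.sum(axis=1)


def personalize(x, y, pretrained, h=0.2, n_val=None):
    """Illustrative sketch (not the paper's algorithm): adapt a black-box
    predictor to few target samples, with a robustness fallback.

    `pretrained` is any callable mapping inputs to predictions.
    Returns a fitted estimator (a callable)."""
    rng = np.random.default_rng(0)
    n = len(x)
    n_val = n_val or max(1, n // 5)
    idx = rng.permutation(n)
    val, tr = idx[:n_val], idx[n_val:]

    # Candidate 1: smooth the residuals y - pretrained(x), then add them back,
    # i.e. use the black box as an offset and estimate only the correction.
    resid = y[tr] - pretrained(x[tr])

    def f_corrected(xq):
        return pretrained(xq) + kernel_smooth(x[tr], resid, xq, h)

    # Candidate 2: ignore the black box entirely (fallback when it is
    # uninformative, which is what gives the robustness guarantee).
    def f_plain(xq):
        return kernel_smooth(x[tr], y[tr], xq, h)

    # Select the candidate with smaller held-out squared error.
    err_c = np.mean((y[val] - f_corrected(x[val])) ** 2)
    err_p = np.mean((y[val] - f_plain(x[val])) ** 2)
    return f_corrected if err_c <= err_p else f_plain
```

With a helpful pretrained model, the residual surface is flatter than the raw regression function, so the same small sample estimates it more accurately; with a misleading model, validation steers the procedure back to ordinary smoothing.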