🤖 AI Summary
This study addresses the challenge of predictive modeling when predictions themselves alter user behavior, degrading model generalization, particularly in settings where both the sample and the population react to the prediction. The authors embed performative prediction in the statistical learning theory framework: they model prediction-induced distribution shifts as min-max (self-negating) and min-min (self-fulfilling) risk functionals in Wasserstein space and derive corresponding generalization error bounds. The analysis reveals a fundamental trade-off between intervening in the world and learning from it, and it yields the counterintuitive insight that retraining on performatively distorted samples can improve generalization guarantees. A case study on German labor market data from 1975 to 2017 illustrates the bounds and underscores the impact of performativity on generalization in policy allocation tasks.
📝 Abstract
Performative predictions influence the very outcomes they aim to forecast. We study performative predictions that affect a sample (e.g., only existing users of an app) and/or the whole population (e.g., all potential app users). This raises the question of how well models generalize under performativity. For example, how well can we draw insights about new app users based on existing users when both react to the app's predictions? We address this question by embedding performative predictions into statistical learning theory. We prove generalization bounds under performative effects on the sample, on the population, and on both. A key intuition behind our proofs is that in the worst case, the population negates predictions, while the sample deceptively fulfills them. We cast such self-negating and self-fulfilling predictions as min-max and min-min risk functionals in Wasserstein space, respectively. Our analysis reveals a fundamental trade-off between performatively changing the world and learning from it: the more a model affects data, the less it can learn from it. Moreover, our analysis yields a surprising insight into how to improve generalization guarantees by retraining on performatively distorted samples. We illustrate our bounds in a case study on prediction-informed assignments of unemployed German residents to job training programs, drawing on German administrative labor market records from 1975 to 2017.
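To make the abstract's key construction concrete, here is a minimal sketch of how self-negating and self-fulfilling predictions could be cast as min-max and min-min risk functionals over Wasserstein balls. All notation below (the performativity radius ε, the Wasserstein distance W_1, the loss ℓ, the base distribution μ, and the empirical distribution μ̂_n) is our own illustrative assumption, not notation confirmed by the paper:

```latex
% Illustrative sketch only: the notation (\varepsilon, W_1, \ell, \mu, \hat{\mu}_n)
% is assumed for exposition and is not taken from the paper.

% Self-negating population (min-max): in the worst case, the population shifts
% within a Wasserstein ball of radius \varepsilon to negate the prediction.
\[
  \mathcal{R}_{\max}(\theta)
    = \max_{\mu' \,:\, W_1(\mu', \mu) \le \varepsilon}
      \mathbb{E}_{z \sim \mu'}\!\left[\ell(\theta; z)\right]
\]

% Self-fulfilling sample (min-min): the sample shifts within a Wasserstein ball
% to deceptively fulfill the prediction, so the observed empirical risk
% understates the true risk.
\[
  \widehat{\mathcal{R}}_{\min}(\theta)
    = \min_{\nu \,:\, W_1(\nu, \hat{\mu}_n) \le \varepsilon}
      \mathbb{E}_{z \sim \nu}\!\left[\ell(\theta; z)\right]
\]
```

Under this reading, a generalization bound would control the gap between the worst-case population risk and the best-case empirical risk, a gap that widens as ε grows. This is one way to see the stated trade-off: the more a model affects the data, the less it can learn from it.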