🤖 AI Summary
This work challenges the conventional view that distribution shift hinders learning. It introduces the Positive Distribution Shift (PDS) framework, in which a well-chosen training distribution \(D'(x)\), different from the target distribution \(D(x)\), makes learning easier. The benefit is argued to be computational rather than statistical: without changing the learning algorithm, computationally hard learning problems can become tractable under \(D'(x)\), even with standard gradient-based training. The paper formalizes several variants of PDS, shows that certain hard function classes are easily learnable under PDS, and draws connections to membership query learning. These results highlight the design of the training distribution as a tool for overcoming computational barriers in machine learning.
📝 Abstract
We study a setting where the goal is to learn a target function f(x) with respect to a target distribution D(x), but training is done on i.i.d. samples from a different training distribution D'(x), labeled by the true target f(x). Such a distribution shift (here in the form of covariate shift) is usually viewed negatively, as hurting or making learning harder, and the traditional distribution shift literature is mostly concerned with limiting or avoiding this negative effect. In contrast, we argue that with a well-chosen D'(x), the shift can be positive and make learning easier -- a perspective called Positive Distribution Shift (PDS). Such a perspective is central to contemporary machine learning, where much of the innovation is in finding good training distributions D'(x), rather than changing the training algorithm. We further argue that the benefit is often computational rather than statistical, and that PDS allows computationally hard problems to become tractable even using standard gradient-based training. We formalize different variants of PDS, show how certain hard classes are easily learnable under PDS, and make connections with membership query learning.
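One way to write the setting down explicitly is sketched below. The notation (hypothesis \(h\), loss \(\ell\), sample size \(m\), learning algorithm \(A\)) is an assumption for illustration and may differ from the paper's own formalization.

\[
S = \{(x_i, f(x_i))\}_{i=1}^{m}, \quad x_i \overset{\text{i.i.d.}}{\sim} D', \qquad h = A(S), \qquad L_D(h) = \mathbb{E}_{x \sim D}\big[\ell(h(x), f(x))\big].
\]

The labels always come from the same target \(f\), so only the marginal over \(x\) changes between training and evaluation (covariate shift). The learner sees samples from \(D'\) but is judged by \(L_D(h)\); in this notation, PDS is the claim that a suitable \(D' \neq D\) can make a small \(L_D(h)\) reachable with less computation than training on samples from \(D\) itself.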