🤖 AI Summary
This work addresses the problem of efficiently learning halfspace classifiers in the presence of perturbed contrastive samples. It proposes a two-step contrastive mechanism in which each sample is paired with a contrastive counterpart of opposite label, and the magnitude of the perturbation is determined by a non-decreasing noise function of the sample's distance to the decision boundary. This formulation more realistically captures practical contrastive learning scenarios. The paper provides the first analysis of both active and passive sample complexity for halfspaces under this framework. Assuming a uniform data distribution, it establishes that, for both fixed and stochastic perturbations, contrastive samples can significantly reduce asymptotic query complexity and asymptotic expected query complexity under certain noise conditions, thereby improving learning efficiency.
📄 Abstract
We study learning under a two-step contrastive example oracle, as introduced by Mansouri et al. (2025), where each queried (or sampled) labeled example is paired with an additional contrastive example of opposite label. While Mansouri et al. assume an idealized setting, in which the contrastive example is at minimum distance from the originally queried/sampled point, we introduce and analyze a mechanism, parameterized by a non-decreasing noise function $f$, under which this ideal contrastive example is perturbed. The amount of perturbation is controlled by $f(d)$, where $d$ is the distance of the queried/sampled point to the decision boundary. Intuitively, this yields higher-quality contrastive examples for points closer to the decision boundary. We study this model in two settings: (i) when the maximum perturbation magnitude is fixed, and (ii) when it is stochastic. For one-dimensional thresholds and for halfspaces under the uniform distribution on a bounded domain, we characterize the active and passive contrastive sample complexity as a function of $f$. We show that, under certain conditions on $f$, the presence of contrastive examples speeds up learning in terms of asymptotic query complexity and asymptotic expected query complexity.
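To make the oracle model concrete, the following is a minimal sketch, not the paper's formal definition, of the perturbed two-step contrastive oracle for the one-dimensional threshold case. The function `contrastive_oracle`, the threshold `theta`, and the choice of a uniformly random perturbation magnitude in $[0, f(d)]$ (one plausible reading of the stochastic variant) are illustrative assumptions.

```python
import random

def contrastive_oracle(x, theta, f, rng):
    """Hypothetical perturbed contrastive oracle for a 1-D threshold at theta.

    Labels are +1 for x >= theta and -1 otherwise. The ideal contrastive
    example for x is the closest opposite-label point, i.e. the boundary
    point theta itself. Here it is pushed further into the opposite-label
    region by a random amount of at most f(d), where d = |x - theta| is
    x's distance to the decision boundary (an assumed stochastic variant).
    """
    label = 1 if x >= theta else -1
    d = abs(x - theta)
    # Perturbation magnitude bounded by the non-decreasing noise function f:
    # points near the boundary (small d) get higher-quality contrastive examples.
    eps = rng.uniform(0, f(d))
    contrastive = theta - label * eps  # lands on the opposite side of theta
    return label, contrastive

# Example: theta = 0.5, f(d) = 0.1 * d.
rng = random.Random(0)
label, c = contrastive_oracle(0.8, 0.5, lambda d: 0.1 * d, rng)
# label is +1; c lies within f(0.3) = 0.03 below the boundary, i.e. in [0.47, 0.5].
```

Note how the guarantee degrades with distance: a query at $x = 0.8$ may receive a contrastive point anywhere in $[0.47, 0.5]$, while a query at $x = 0.51$ would receive one within $0.001$ of the boundary.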