🤖 AI Summary
Strict replicability is unattainable in PAC learning: for instance, no replicable algorithm exists even for threshold learning.
Method: The paper introduces three natural relaxations of replicability (pointwise, approximate, and semi-replicability) and develops corresponding learning algorithms that leverage shared randomness and, in the semi-replicable case, shared unlabeled samples, building on tools from stability and statistical learning theory.
Results: For constant replicability parameters, pointwise and approximate replicability come "for free": both admit sample-optimal agnostic PAC learners with sample complexity Θ(d/α²). Semi-replicability, in which the algorithm is fully replicable but may share unlabeled samples, is achieved with Θ(d²/α²) labeled examples. Together these results show that approximate notions of replicable learning are both feasible and sample-efficient.
📝 Abstract
Replicability, introduced by Impagliazzo et al. (STOC '22), is the notion that algorithms should remain stable under a resampling of their inputs (given access to shared randomness). While a strong and interesting notion of stability, the cost of replicability can be prohibitive: there is no replicable algorithm, for instance, for tasks as simple as threshold learning (Bun et al. STOC '23). Given such strong impossibility results we ask: under what approximate notions of replicability is learning possible?
In this work, we propose three natural relaxations of replicability in the context of PAC learning: (1) Pointwise: the learner must be consistent on any fixed input, but not across all inputs simultaneously, (2) Approximate: the learner must output hypotheses that classify most of the distribution consistently, (3) Semi: the algorithm is fully replicable, but may additionally use shared unlabeled samples. In all three cases, for constant replicability parameters, we obtain sample-optimal agnostic PAC learners: (1) and (2) are achievable for ``free'' using $\Theta(d/\alpha^2)$ samples, while (3) requires $\Theta(d^2/\alpha^2)$ labeled samples.
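The shared-randomness mechanism behind replicability can be illustrated with a classic toy example (not taken from this paper): replicable mean estimation via randomized rounding. The function `replicable_mean` and its grid width are our own hypothetical choices; the point is only that two runs on independently resampled data, sharing the same random seed, typically produce *identical* outputs.

```python
import random

def replicable_mean(samples, shared_seed, grid=0.1):
    """Illustrative sketch: snap the empirical mean to a grid whose
    offset is drawn from the shared randomness. Two runs on fresh
    samples that share a seed land in the same cell w.h.p."""
    offset = random.Random(shared_seed).uniform(0, grid)  # shared randomness
    mean = sum(samples) / len(samples)
    # round the empirical mean to the randomly shifted grid
    return round((mean - offset) / grid) * grid + offset

# Two independent samples drawn from the same distribution:
data_rng = random.Random(0)
s1 = [data_rng.gauss(0.5, 0.05) for _ in range(1000)]
s2 = [data_rng.gauss(0.5, 0.05) for _ in range(1000)]

o1 = replicable_mean(s1, shared_seed=42)
o2 = replicable_mean(s2, shared_seed=42)
print(o1, o2)  # with high probability the two outputs coincide exactly
```

The relaxations studied in the paper weaken exactly this requirement of exact output agreement: pointwise and approximate replicability ask only for consistency on fixed inputs or on most of the distribution, which is what makes them achievable at no extra sample cost.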