🤖 AI Summary
This work investigates how the sample complexity of replicable PAC learning depends on the size of the hypothesis class $|H|$. By constructing a hard learning instance based on Cayley graphs and leveraging tools from spectral graph theory and random walk analysis, the authors establish a sample complexity lower bound with a nearly $(\log|H|)^{3/2}$ dependence on $\log|H|$, the strongest currently known. They further design an algorithm for this instance whose sample complexity nearly matches the lower bound. This yields the first result in replicable learning where upper and lower bounds nearly coincide, and shows that any substantially stronger lower bound would require a different hard instance.
📝 Abstract
In this paper, we consider the problem of replicable realizable PAC learning. We construct a particularly hard learning problem and show a sample complexity lower bound with a close to $(\log|H|)^{3/2}$ dependence on the size of the hypothesis class $H$. Our proof uses several novel techniques and works by defining a particular Cayley graph associated with $H$ and analyzing a suitable random walk on this graph by examining the spectral properties of its adjacency matrix.
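The proof strategy hinges on reading off the behavior of a random walk from the spectrum of a Cayley graph's adjacency matrix. As a hedged illustration of this general technique (not the paper's actual construction), the sketch below builds a standard Cayley graph, the hypercube $\mathrm{Cay}(\mathbb{Z}_2^n, \{e_1,\dots,e_n\})$, and computes the spectral gap of the lazy random walk on it, which governs the walk's mixing time:

```python
import numpy as np

# Illustrative only: a well-known Cayley graph (the n-dimensional
# hypercube over Z_2^n with unit-vector generators), not the specific
# graph constructed in the paper.
n = 4          # dimension; the graph has 2**n vertices
N = 2 ** n

# Adjacency matrix: x ~ y iff x and y differ in exactly one bit,
# i.e. y = x XOR e_i for some generator e_i.
A = np.zeros((N, N))
for x in range(N):
    for i in range(n):
        A[x, x ^ (1 << i)] = 1.0

# Transition matrix of the lazy simple random walk (stay put w.p. 1/2).
P = 0.5 * np.eye(N) + 0.5 * A / n

# Eigenvalues of P, sorted descending; the top eigenvalue is 1 and the
# spectral gap 1 - lambda_2 controls how fast the walk mixes.
eigs = np.sort(np.linalg.eigvalsh(P))[::-1]
gap = eigs[0] - eigs[1]
# For the lazy hypercube walk the gap is exactly 1/n.
```

For Cayley graphs over abelian groups the eigenvalues have closed forms via characters (here: $1 - k/n$ for Hamming weight $k$), which is what makes spectral analysis of such random walks tractable.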
Furthermore, we show an almost matching upper bound for the lower bound instance, implying that any stronger lower bound would have to be proved on a different instance of the problem.