🤖 AI Summary
This paper investigates the fundamental trade-off between computational efficiency and statistical sample complexity in PAC learning: under standard worst-case complexity assumptions, must computational efficiency come at the cost of higher sample requirements? Circumventing the formal barriers of Applebaum, Barak, and Xiao (2008), it provides the first NP-hardness-based computational-statistical tradeoffs for improper learning. The approach combines average-case complexity analysis, VC-dimension theory, and NP-hardness reductions to construct learning problems over subclasses of polynomial-size circuits. The key results show that if NP requires exponential time, then for every polynomial p(n) there is a function class of VC dimension one whose time-efficient sample complexity is exactly Θ(p(n)), so efficient learners can be forced to use arbitrarily large polynomial sample complexity despite constant VC dimension. Furthermore, the paper proves that all NP-enumerable function classes are efficiently learnable with near-optimal sample complexity if and only if RP = NP. This work thus establishes, for the first time, the classical RP versus NP question as a precise criterion for the feasibility of efficient learning.
📝 Abstract
A central question in computer science and statistics is whether efficient algorithms can achieve the information-theoretic limits of statistical problems. Many computational-statistical tradeoffs have been shown under average-case assumptions, but since statistical problems are average-case in nature, it has been a challenge to base them on standard worst-case assumptions.
In PAC learning, where such tradeoffs were first studied, the question is whether computational efficiency can come at the cost of using more samples than information-theoretically necessary. We base such tradeoffs on $\mathsf{NP}$-hardness and obtain:
$\circ$ Sharp computational-statistical tradeoffs assuming $\mathsf{NP}$ requires exponential time: For every polynomial $p(n)$, there is an $n$-variate class $C$ with VC dimension $1$ such that the sample complexity of time-efficiently learning $C$ is $\Theta(p(n))$.
$\circ$ A characterization of $\mathsf{RP}$ vs. $\mathsf{NP}$ in terms of learning: $\mathsf{RP} = \mathsf{NP}$ iff every $\mathsf{NP}$-enumerable class is learnable with $O(\mathrm{VCdim}(C))$ samples in polynomial time. The forward implication has been known since (Pitt and Valiant, 1988); we prove the reverse implication.
Notably, all our lower bounds hold against improper learners. These are the first $\mathsf{NP}$-hardness results for improperly learning a subclass of polynomial-size circuits, circumventing formal barriers of Applebaum, Barak, and Xiao (2008).
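Both results are stated in terms of VC dimension, the combinatorial measure that governs information-theoretic sample complexity. As a self-contained illustration (not part of the paper), the following sketch brute-forces the VC dimension of a finite class by checking which point sets it shatters; one-dimensional threshold functions are a classic example of a class with VC dimension 1, like the classes in the first result.

```python
from itertools import combinations

def shatters(hypotheses, points):
    """A class shatters `points` iff it realizes all 2^|points| labelings."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

def vc_dimension(hypotheses, domain):
    """Brute-force VC dimension of a finite class over a finite domain.

    If a class shatters a set of size k+1 it shatters every subset,
    so the largest shattered size can be found by increasing k and
    stopping at the first size with no shattered set.
    """
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(hypotheses, s) for s in combinations(domain, k)):
            d = k
        else:
            break
    return d

# Threshold functions h_t(x) = 1 iff x >= t over {0, ..., 7}.
# Any single point is shattered, but no pair {x1 < x2} admits the
# labeling (1, 0), since x1 >= t implies x2 >= t. Hence VC dimension 1.
domain = list(range(8))
thresholds = [(lambda x, t=t: int(x >= t)) for t in range(9)]
print(vc_dimension(thresholds, domain))  # prints 1
```

The paper's point is that such a combinatorially trivial class (constant VC dimension, hence few samples information-theoretically) can still force any polynomial-time learner, even an improper one, to use polynomially many samples.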