Limitations of Membership Queries in Testable Learning

📅 2025-12-01

🤖 AI Summary

This paper investigates whether membership queries (MQs) can yield superpolynomial time speedups in the testable learning model. Method: We construct a generic reduction from sample-based refutation of Boolean concept classes to testable learning with queries (TL-Q), showing that, relative to a concept class and a distribution family, no $m$-sample TL-Q algorithm can be superpolynomially more time-efficient than the best $m$-sample PAC learner. We further define a class of “statistical” MQ algorithms, which encompasses many known distribution-specific MQ learners such as those based on influence estimation or subcube-conditional statistical queries, and combine it with known lower bounds on the statistical query (SQ) dimension. Contribution/Results: We establish a fundamental limitation of MQs in testable learning: they cannot reduce time complexity beyond that of sample-only distribution-specific learning. Moreover, TL-Q algorithms in the statistical class imply efficient SQ refutation and learning algorithms, so these efficient MQ learners cannot be made testable.

📝 Abstract

Membership queries (MQ) often yield speedups for learning tasks, particularly in the distribution-specific setting. We show that in the testable learning model of Rubinfeld and Vasilyan [RV23], membership queries cannot decrease the time complexity of testable learning algorithms beyond the complexity of sample-only distribution-specific learning. In the testable learning model, the learner must output a hypothesis whenever the data distribution satisfies a desired property, and if it outputs a hypothesis, the hypothesis must be near-optimal. We give a general reduction from sample-based refutation of Boolean concept classes, as presented in [Vadhan17, KL18], to testable learning with queries (TL-Q). This yields lower bounds for TL-Q via the reduction from learning to refutation given in [KL18]. The result is that, relative to a concept class and a distribution family, no $m$-sample TL-Q algorithm can be super-polynomially more time-efficient than the best $m$-sample PAC learner. Finally, we define a class of “statistical” MQ algorithms that encompasses many known distribution-specific MQ learners, such as those based on influence estimation or subcube-conditional statistical queries. We show that TL-Q algorithms in this class imply efficient statistical-query refutation and learning algorithms. Thus, combined with known SQ dimension lower bounds, our results imply that these efficient membership query learners cannot be made testable.
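
The two requirements of the testable learning model described in the abstract can be written out roughly as follows. The notation is an assumption made here for illustration and is not taken from the paper: $D$ is the unknown distribution over labeled examples with marginal $D_X$, $P$ is the desired property of the marginal, $\mathcal{C}$ is the concept class, $A$ is the tester-learner, and $\varepsilon, \delta$ are accuracy and confidence parameters. The additive form of "near-optimal" is also an assumption.

$$
\begin{aligned}
\mathrm{opt}(\mathcal{C}, D) &= \min_{c \in \mathcal{C}} \Pr_{(x,y) \sim D}[c(x) \neq y] \\
\text{(completeness)} \quad & D_X \text{ satisfies } P \;\Longrightarrow\; \Pr[A \text{ outputs a hypothesis}] \geq 1 - \delta \\
\text{(soundness)} \quad & \Pr\big[A \text{ outputs } h \text{ with } \Pr_{(x,y) \sim D}[h(x) \neq y] > \mathrm{opt}(\mathcal{C}, D) + \varepsilon\big] \leq \delta
\end{aligned}
$$

Soundness is required for every $D$, not only for those whose marginal satisfies $P$; that is what separates a tester-learner from an ordinary distribution-specific learner.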

Problem

Research questions and friction points this paper is trying to address.

Can membership queries reduce the time complexity of testable learning below that of sample-only distribution-specific learning?

Can an $m$-sample testable learner with queries (TL-Q) be super-polynomially faster than the best $m$-sample PAC learner?

Can efficient “statistical” membership query learners be made testable?

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reduces sample-based refutation of Boolean concept classes to testable learning with queries (TL-Q)

Shows membership queries cannot yield super-polynomial time speedups over the best $m$-sample PAC learner

Defines “statistical” MQ algorithms and shows that TL-Q algorithms in this class imply efficient SQ refutation and learning
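
As a concrete picture of the kind of primitive the abstract groups under “statistical” MQ algorithms, here is a minimal sketch of influence estimation with membership queries. It is not code from the paper: the names `estimate_influence`, `mq`, `dictator`, and `parity` are hypothetical, and the sketch assumes inputs are uniform over $\{0,1\}^n$.

```python
import random


def estimate_influence(mq, n, i, num_queries, rng):
    """Estimate Inf_i(f) = Pr_x[f(x) != f(x with bit i flipped)].

    Assumes x is uniform over {0,1}^n. `mq` is a membership-query oracle:
    any callable mapping a tuple of n bits to a label.
    """
    disagreements = 0
    for _ in range(num_queries):
        x = [rng.randint(0, 1) for _ in range(n)]
        x_flipped = list(x)
        x_flipped[i] ^= 1  # flip coordinate i only
        # Two membership queries per trial: the point and its neighbor.
        if mq(tuple(x)) != mq(tuple(x_flipped)):
            disagreements += 1
    return disagreements / num_queries


def dictator(x):
    """Toy target: the label is the first input bit."""
    return x[0]


def parity(x):
    """Toy target: the label is the XOR of all input bits."""
    return sum(x) % 2


rng = random.Random(0)  # fixed seed so runs are repeatable
inf_relevant = estimate_influence(dictator, 4, 0, 200, rng)
inf_irrelevant = estimate_influence(dictator, 4, 1, 200, rng)
inf_parity = estimate_influence(parity, 4, 2, 200, rng)
```

The algorithm uses the oracle only to form an average over random query points, which is the "statistical" flavor the bullet refers to. The estimate is meaningful only when the sampled points really follow the assumed distribution.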
