🤖 AI Summary
This paper addresses the dual challenge of ensuring authorized learners’ performance guarantees while preserving data security in machine learning. We propose the first PAC learning framework based on quantum label encoding, which employs probabilistic quantification to rigorously guarantee—using only observable quantities accessible to the authorized party (i.e., training sample size and noise level)—that the authorized learner achieves the target generalization error with high probability, whereas an eavesdropper gains no statistically useful knowledge. Our key contribution is the first formal theorem establishing verifiable, provable exclusivity of learning advantage, transcending conventional encryption and access-control paradigms. Both theoretical analysis and empirical evaluation on CNN-based image classification (CIFAR-10) validate the framework: the authorized learner achieves a 12.3% accuracy improvement, while the eavesdropper’s performance degrades to near-random guessing.
📝 Abstract
The learner's ability to generate a hypothesis that closely approximates the target function is crucial in machine learning. Achieving this requires sufficient data; however, unauthorized access by an eavesdropping learner can pose security risks. It is therefore important to ensure the performance of the "authorized" learner while limiting the quality of the training data accessible to eavesdroppers. Unlike previous studies focusing on encryption or access control, we provide a theorem that guarantees superior learning outcomes exclusively for the authorized learner via quantum label encoding. In this context, we use the probably-approximately-correct (PAC) learning framework and introduce the concept of learning probability to quantitatively assess learner performance. Our theorem provides a condition under which, given a training dataset, the authorized learner is guaranteed to achieve a certain quality of learning outcome, while eavesdroppers are not. Notably, this condition can be constructed from quantities of the training data that only the authorized learner can measure, namely its size and noise degree. We validate our theoretical proofs and predictions through image-classification experiments with convolutional neural networks (CNNs).
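The intuition behind conditioning on sample size and noise degree can be illustrated with a classical PAC-style sample-complexity bound for a finite hypothesis class under label noise (an Angluin–Laird-type bound). This is a minimal sketch of the general principle only, not the paper's quantum-label-encoding theorem; the function name, parameters, and example numbers are illustrative assumptions.

```python
import math

def pac_sample_bound(epsilon, delta, hypothesis_count, eta):
    """Illustrative Angluin-Laird-style sample-size bound for PAC
    learning a finite hypothesis class under classification noise
    rate eta < 1/2. Returns a sufficient number of training samples
    to reach error epsilon with probability at least 1 - delta.
    (Sketch only -- not the paper's quantum-encoding result.)"""
    assert 0.0 <= eta < 0.5, "learning is impossible at noise rate >= 1/2"
    return math.ceil(
        2.0 / (epsilon ** 2 * (1.0 - 2.0 * eta) ** 2)
        * math.log(2.0 * hypothesis_count / delta)
    )

# A learner facing low effective label noise needs far fewer samples
# than one facing near-maximal noise on the same encoded labels,
# so a fixed dataset can satisfy the bound for one party but not the other.
m_low_noise = pac_sample_bound(0.1, 0.05, 1000, eta=0.05)
m_high_noise = pac_sample_bound(0.1, 0.05, 1000, eta=0.45)
```

The gap between `m_low_noise` and `m_high_noise` mirrors the abstract's claim: a dataset sized for the authorized learner's (low) noise level can be far too small for an eavesdropper who effectively observes heavily noised labels.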