🤖 AI Summary
This paper studies the sample complexity of Learning from Label Proportions (LLP) under square loss: training examples are grouped into "bags," only the average label of each bag is observable, and the goal is low-regret prediction at the level of individual examples. To address limitations of existing methods, namely sample complexity that degrades with bag size and loose theoretical bounds, the authors propose carefully designed variants of Empirical Risk Minimization coupled with a variance-reduced Stochastic Gradient Descent algorithm. The approach achieves essentially optimal sample complexity: the upper bound does not depend on bag size and matches information-theoretic lower bounds in its dependence on feature dimensionality and hypothesis class complexity, and the tightness of this bound is established theoretically. Experiments on multiple benchmark datasets show that the method significantly outperforms recent baselines while using fewer total samples, empirically validating the theoretical advances.
📝 Abstract
We investigate Learning from Label Proportions (LLP), a partial-information setting where examples in a training set are grouped into bags, and only aggregate label values in each bag are available. Despite the partial observability, the goal is still to achieve small regret at the level of individual examples. We give results on the sample complexity of LLP under square loss, showing that our sample complexity is essentially optimal. From an algorithmic viewpoint, we rely on carefully designed variants of Empirical Risk Minimization and Stochastic Gradient Descent algorithms, combined with ad hoc variance reduction techniques. On one hand, our theoretical results improve in important ways on the existing literature on LLP, specifically in the way the sample complexity depends on the bag size. On the other hand, we validate our algorithmic solutions on several datasets, demonstrating improved empirical performance (better accuracy with fewer samples) against recent baselines.
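To make the LLP setting concrete, here is a minimal, self-contained sketch of bag-level Empirical Risk Minimization under square loss trained with SGD. This is an illustrative toy setup only, not the paper's actual algorithm: the synthetic data generator, the linear model, and all hyperparameters (`lr`, `epochs`, bag construction) are assumptions made for the example, and no variance reduction is included.

```python
import random

def make_bags(n_bags, bag_size, w_true, seed=0):
    # Synthetic LLP data: each example has observable features,
    # but only the average label per bag is revealed to the learner.
    # (Illustrative construction, not the paper's.)
    rng = random.Random(seed)
    bags = []
    for _ in range(n_bags):
        xs = [[rng.uniform(-1, 1) for _ in w_true] for _ in range(bag_size)]
        avg_label = sum(
            sum(wi * xi for wi, xi in zip(w_true, x)) for x in xs
        ) / bag_size
        bags.append((xs, avg_label))
    return bags

def train_llp_sgd(bags, dim, lr=0.1, epochs=200, seed=1):
    # Bag-level ERM under square loss: minimize, over bags,
    # (mean prediction within the bag - observed bag average)^2,
    # via plain SGD on a linear model (no variance reduction here).
    rng = random.Random(seed)
    bags = list(bags)
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(bags)
        for xs, avg_label in bags:
            k = len(xs)
            avg_pred = sum(
                sum(wi * xi for wi, xi in zip(w, x)) for x in xs
            ) / k
            grad_scale = 2.0 * (avg_pred - avg_label)
            for j in range(dim):
                avg_xj = sum(x[j] for x in xs) / k
                w[j] -= lr * grad_scale * avg_xj
    return w
```

Even though the learner never sees an individual label, in this linear, realizable toy case the bag-average supervision is enough to recover a predictor that is accurate on individual examples, which is the regret notion the paper targets.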