Optimal Learning from Label Proportions with General Loss Functions

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses Learning from Label Proportions (LLP), a weakly supervised paradigm in which only the average label proportions over grouped instances (bags) are observed, while the goal remains to train a classifier for individual instances. Existing LLP methods suffer from limited loss-function compatibility and suboptimal bias–variance trade-offs. To address these issues, the authors propose a novel low-variance debiased gradient estimator. This estimator is the first to uniformly support diverse standard losses, including cross-entropy and hinge loss, by jointly leveraging label-proportion estimation and gradient correction. The design yields stronger theoretical guarantees, notably an improved sample complexity bound, along with enhanced empirical performance: extensive experiments across multiple benchmark datasets show the method consistently outperforming state-of-the-art baselines on both binary and multiclass classification tasks.
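To make the LLP setting concrete, here is a minimal illustrative sketch. It is not the paper's debiased estimator; it trains a linear classifier by matching each bag's predicted mean label to the observed bag proportion with a cross-entropy proportion loss, a standard LLP baseline. All data is synthetic and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic LLP data: instances are grouped into bags; only each bag's
# average label (the proportion of positives) is observed during training.
n_bags, bag_size, dim = 50, 20, 5
w_true = rng.normal(size=dim)
X = rng.normal(size=(n_bags, bag_size, dim))
y = (X @ w_true > 0).astype(float)   # hidden instance-level labels
props = y.mean(axis=1)               # observed bag proportions (the only supervision)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit a linear model by gradient descent on the bag-level cross-entropy
# between observed proportions and predicted mean probabilities.
w = np.zeros(dim)
lr = 0.5
for _ in range(500):
    p = sigmoid(X @ w)               # (n_bags, bag_size) instance probabilities
    p_bar = p.mean(axis=1)           # predicted bag proportions
    # d CE(props, p_bar) / d p_bar, guarded against division by zero
    err = (p_bar - props) / (p_bar * (1.0 - p_bar) + 1e-12)
    # Chain rule through the bag mean: d p_bar / dw = mean_i p_i (1 - p_i) x_i
    grad = np.einsum('b,bid,bi->d', err, X, p * (1.0 - p)) / bag_size
    w -= lr * grad / n_bags

# Although only bag averages were used for training, evaluate at the
# instance level against the hidden labels.
acc = ((sigmoid(X @ w) > 0.5) == y).mean()
print(f"instance-level accuracy: {acc:.2f}")
```

The point of the sketch is the structure of the problem: supervision enters only through `props`, yet the learned `w` scores individual instances. The paper's contribution, per the summary above, is a lower-variance debiased gradient estimator that improves on this kind of baseline across a broader family of losses.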

📝 Abstract
Motivated by problems in online advertising, we address the task of Learning from Label Proportions (LLP). In this partially-supervised setting, training data consists of groups of examples, termed bags, for which we only observe the average label value. The main goal, however, remains the design of a predictor for the labels of individual examples. We introduce a novel and versatile low-variance de-biasing methodology to learn from aggregate label information, significantly advancing the state of the art in LLP. Our approach exhibits remarkable flexibility, seamlessly accommodating a broad spectrum of practically relevant loss functions across both binary and multi-class classification settings. By carefully combining our estimators with standard techniques, we substantially improve sample complexity guarantees for a large class of losses of practical relevance. We also empirically validate the efficacy of our proposed approach across a diverse array of benchmark datasets, demonstrating compelling empirical advantages over standard baselines.
Problem

Research questions and friction points this paper is trying to address.

Learning from aggregate label proportions instead of individual labels
Designing predictors for individual examples from group averages
Accommodating various loss functions in binary and multi-class classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-variance de-biasing methodology for aggregate labels
Flexible accommodation of broad loss functions
Improved sample complexity guarantees via estimators