Graph-based Semi-Supervised Learning via Maximum Discrimination

📅 2026-02-08
🤖 AI Summary
This work addresses the limitations of conventional graph-based semi-supervised learning methods when labeled data is scarce and unlabeled data is abundant, particularly under complex label distributions. The authors propose AUC-spec, a graph-based approach that computes a low-dimensional embedding by maximizing the AUC estimated from the labeled points, balancing class separation against graph smoothness. Under a product-of-manifold model, the analysis shows that the number of labeled points AUC-spec requires is polynomial in the model parameters. Experiments on synthetic and real-world datasets show competitive classification performance, with computational efficiency comparable to classic and state-of-the-art methods.

📝 Abstract
Semi-supervised learning (SSL) addresses the critical challenge of training accurate models when labeled data is scarce but unlabeled data is abundant. Graph-based SSL (GSSL) has emerged as a popular framework that captures data structure through graph representations. Classic graph SSL methods, such as Label Propagation and Label Spreading, aim to compute low-dimensional representations where points with the same labels are close in representation space. Although often effective, these methods can be suboptimal on data with complex label distributions. In our work, we develop AUC-spec, a graph approach that computes a low-dimensional representation that maximizes class separation. We compute this representation by optimizing the Area Under the ROC Curve (AUC) as estimated via the labeled points. We provide a detailed analysis of our approach under a product-of-manifold model, and show that the required number of labeled points for AUC-spec is polynomial in the model parameters. Empirically, we show that AUC-spec balances class separation with graph smoothness. It demonstrates competitive results on synthetic and real-world datasets while maintaining computational efficiency comparable to the field's classic and state-of-the-art methods.
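The two quantities the abstract trades off, AUC estimated from the labeled points and smoothness over the graph, can be illustrated with a minimal sketch. This is not the authors' AUC-spec algorithm: the function names `empirical_auc` and `laplacian_smoothness`, the toy path graph, and the use of the unnormalized Laplacian are all assumptions chosen for illustration.

```python
import numpy as np


def empirical_auc(scores, labels):
    """Empirical AUC of a 1-D score: the fraction of (positive, negative)
    pairs where the positive point scores higher; ties count as 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Pairwise differences between every positive and every negative score.
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def laplacian_smoothness(W, f):
    """Quadratic form f^T L f with the unnormalized Laplacian L = D - W.
    For symmetric W this equals 0.5 * sum_ij W_ij * (f_i - f_j)**2,
    so smaller values mean f varies less across graph edges."""
    W = np.asarray(W, dtype=float)
    f = np.asarray(f, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    return float(f @ L @ f)


# Hypothetical toy data: a path graph 0 - 1 - 2 - 3 with two labeled classes.
W = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)
f = np.array([0.0, 1.0, 2.0, 3.0])   # candidate one-dimensional representation
labels = np.array([0, 0, 1, 1])      # here every node is labeled, for simplicity

auc = empirical_auc(f, labels)
smoothness = laplacian_smoothness(W, f)
print(auc, smoothness)
```

In a semi-supervised setting, the AUC term would be evaluated only on the labeled subset, while the smoothness term uses every node in the graph. How AUC-spec combines the two is described in the paper and is not reproduced here.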
Problem

Research questions and friction points this paper is trying to address.

Semi-Supervised Learning
Graph-based Learning
Class Separation
Label Scarcity
Complex Label Distributions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-based Semi-Supervised Learning
AUC Optimization
Maximum Discrimination
Label Efficiency
Manifold Model
Authors
Nadav Katz
Department of Statistics and Data Science, The Hebrew University of Jerusalem
Ariel Jaffe
The Hebrew University
statistics, machine learning, computational biology