Avoiding Structural Failure Modes in Tabular Fair SSL: Online Primal-Dual Allocation under Confidence Gating

📅 2026-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a structural conflict in fair semi-supervised learning on tabular data: combining confidence-gated pseudo-labeling with moment-matching fairness regularization can trigger Masking Collapse (fairness pressure erodes confidence and starves pseudo-labels) and Trivial Saturation (drift to constant predictors). The study diagnoses these failure modes through a stress test and introduces an Online Primal-Dual Allocation (OPDA) controller that schedules fairness and entropy-based stability penalties using constraint-violation, risk, and pseudo-label health signals, which avoids per-dataset selection of a fixed fairness weight. On Adult, ACSIncome, and COMPAS, OPDA mitigates the degenerate regimes observed under static weighting and simple single-signal adaptive baselines. On Adult and COMPAS it reaches non-degenerate operating points competitive with the empirical static-$\lambda$ frontier; on ACSIncome it preserves utility with a wider fairness-utility spread.
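
To make the two failure modes concrete, here is a minimal diagnostic sketch for a binary classifier. It is not taken from the paper: the function names, the confidence threshold `tau`, and the cut-offs `mask_floor` and `sat_margin` are all hypothetical choices for illustration.

```python
from typing import Sequence


def mask_rate(probs: Sequence[float], tau: float = 0.95) -> float:
    """Fraction of unlabeled examples whose confidence max(p, 1 - p) reaches tau."""
    if not probs:
        return 0.0
    kept = sum(1 for p in probs if max(p, 1.0 - p) >= tau)
    return kept / len(probs)


def positive_rate(probs: Sequence[float]) -> float:
    """Fraction of examples predicted positive at a 0.5 decision threshold."""
    if not probs:
        return 0.0
    return sum(1 for p in probs if p >= 0.5) / len(probs)


def diagnose(
    probs: Sequence[float],
    tau: float = 0.95,
    mask_floor: float = 0.05,  # assumed: below this, pseudo-labels are starved
    sat_margin: float = 0.01,  # assumed: within this of 0 or 1, predictor is constant
) -> str:
    """Label a batch of positive-class probabilities with a coarse health status."""
    if mask_rate(probs, tau) < mask_floor:
        return "masking_collapse"
    rate = positive_rate(probs)
    if rate <= sat_margin or rate >= 1.0 - sat_margin:
        return "trivial_saturation"
    return "healthy"
```

In this sketch, a batch of near-0.5 probabilities passes almost nothing through the confidence gate and is flagged as masking collapse, while a batch that is confidently all-positive passes the gate but is flagged as trivial saturation.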

📝 Abstract

Semi-supervised learning (SSL) enables prediction with limited labels, but high-stakes tabular applications (medical, credit, recidivism) require statistical fairness guarantees. We identify a structural conflict in tabular fair SSL through a diagnostic stress test: under confidence-gated pseudo-labeling, moment-matching fairness regularizers can trigger two failure modes: Masking Collapse (fairness erodes confidence, starving pseudo-labels) and Trivial Saturation (drift to constant predictors). We propose Online Primal-Dual Allocation (OPDA), an online controller that schedules fairness and entropy-based stability penalties using violation, risk, and pseudo-label health signals, avoiding per-dataset selection of a fixed fairness weight within this diagnostic regime. On the evaluated tabular benchmarks (Adult, ACSIncome, COMPAS), OPDA mitigates the degenerate regimes observed under static weighting and simple single-signal adaptive baselines. On Adult and COMPAS, it yields non-degenerate operating points competitive with the empirical static-$\lambda$ frontier; on ACSIncome, it preserves utility with a wider fairness-utility spread. Relative to OPDA-lite, the full controller mainly shifts the operating point toward higher utility on ACSIncome, while Adult highlights the fairness-utility trade-off between the two variants. These results position OPDA as a calibration-free controller for non-degenerate operating points in tabular fair SSL without per-dataset tuning.
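
The abstract does not spell out OPDA's update rule, so the following is only a generic primal-dual-style sketch of how penalty weights might be scheduled online instead of fixed per dataset. Everything here is an assumption for illustration: the class and function names, the tolerance `eps`, the step size `eta`, the `backoff` factor, and the use of just two signals (fairness violation and pseudo-label mask rate); the risk signal mentioned in the abstract is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControllerState:
    lam_fair: float = 0.0  # weight on the fairness penalty
    lam_ent: float = 0.0   # weight on the entropy-based stability penalty


def _clip(x: float, lo: float, hi: float) -> float:
    return min(hi, max(lo, x))


def update(
    state: ControllerState,
    violation: float,   # measured fairness gap, e.g. a demographic-parity difference
    mask_rate: float,   # fraction of unlabeled examples passing the confidence gate
    *,
    eps: float = 0.02,         # assumed fairness tolerance
    mask_floor: float = 0.05,  # assumed minimum healthy mask rate
    eta: float = 0.5,          # assumed dual step size
    backoff: float = 0.5,      # assumed damping when pseudo-labels are starved
    lam_max: float = 10.0,
) -> ControllerState:
    """One projected dual-ascent step on both penalty weights."""
    # Raise the fairness weight while violation > eps, lower it otherwise.
    lam_fair = _clip(state.lam_fair + eta * (violation - eps), 0.0, lam_max)
    # Raise the stability weight while the mask rate is below its floor.
    lam_ent = _clip(state.lam_ent + eta * (mask_floor - mask_rate), 0.0, lam_max)
    # Heuristic (not from the paper): ease fairness pressure when the gate is starving.
    if mask_rate < mask_floor:
        lam_fair *= backoff
    return ControllerState(lam_fair=lam_fair, lam_ent=lam_ent)
```

A training loop would call `update` once per step or epoch and use the returned weights in a loss of the form `supervised + pseudo_label + lam_fair * fairness + lam_ent * stability`. The projection onto `[0, lam_max]` keeps both weights non-negative and bounded, which is the usual choice for dual variables of inequality constraints.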

Problem

Research questions and friction points this paper is trying to address.

fair semi-supervised learning

structural failure modes

confidence gating

tabular data

pseudo-labeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Online Primal-Dual Allocation

Tabular Fair SSL

Confidence Gating