Learning to Forget with Information Divergence Reweighted Objectives for Noisy Labels

📅 2025-08-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Robust learning under label noise remains challenging because the supervision signal is unreliable. Method: The paper proposes ANTIDOTE, a loss based on a relaxation over an information-divergence neighborhood, reformulated via convex duality as an adversarial training procedure whose cost is comparable to standard cross-entropy. Defining noisy-sample neighborhoods through information divergence lets the method adaptively down-weight those samples' gradient contributions; this "forgetting" behavior is unified with distributional robustness, requires no auxiliary modules, integrates seamlessly into standard cross-entropy pipelines, and retains strong theoretical interpretability. Contribution/Results: Extensive experiments demonstrate consistent gains over state-of-the-art robust losses across symmetric, asymmetric, synthetic, and real-world noisy benchmarks; training efficiency approaches that of standard cross-entropy, while generalization and training stability are markedly improved.

📝 Abstract
We introduce ANTIDOTE, a new class of objectives for learning under noisy labels which are defined in terms of a relaxation over an information-divergence neighborhood. Using convex duality, we provide a reformulation as an adversarial training method that has similar computational cost to training with standard cross-entropy loss. We show that our approach adaptively reduces the influence of the samples with noisy labels during learning, exhibiting a behavior that is analogous to forgetting those samples. ANTIDOTE is effective in practical environments where label noise is inherent in the training data or where an adversary can alter the training labels. Extensive empirical evaluations on different levels of symmetric, asymmetric, human annotation, and real-world label noise show that ANTIDOTE outperforms leading comparable losses in the field and enjoys a time complexity that is very close to that of the standard cross entropy loss.
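The abstract describes adaptively reducing the influence of high-loss (likely mislabeled) samples, a behavior analogous to forgetting them. The paper's exact objective is not reproduced here; as an illustrative sketch only, one common dual form of such divergence-constrained objectives is a temperature-scaled soft-min reweighting of per-sample losses, where sample weights decay exponentially with loss. The function name `softmin_reweighted_loss` and the `temperature` parameter are our own, not from the paper.

```python
import numpy as np

def softmin_reweighted_loss(losses, temperature=1.0):
    """Illustrative sketch (not the paper's exact objective): aggregate
    per-sample losses with weights w_i proportional to exp(-loss_i / T),
    so high-loss samples -- the ones most likely to carry noisy labels --
    contribute little gradient, mimicking "forgetting" them.
    """
    losses = np.asarray(losses, dtype=float)
    # Shift by the minimum loss before exponentiating for numerical stability;
    # the normalization below cancels the shift.
    shifted = -(losses - losses.min()) / temperature
    weights = np.exp(shifted)
    weights /= weights.sum()
    return float(np.sum(weights * losses)), weights
```

For example, with per-sample losses `[0.1, 0.2, 5.0]` and a small temperature, the third (high-loss, plausibly mislabeled) sample receives a near-zero weight, so the aggregate loss stays close to the clean samples' losses rather than being dominated by the outlier.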
Problem

Research questions and friction points this paper is trying to address.

Learning robust models under noisy label conditions
Reducing influence of mislabeled samples adaptively
Handling symmetric, asymmetric, and real-world label noise
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial training with divergence-based objectives
Adaptively reduces noisy label influence
Similar computational cost to cross-entropy