Learning to Forget with Information Divergence Reweighted Objectives for Noisy Labels

📅 2025-08-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Robust learning under label noise remains challenging because the supervision signal is unreliable. Method: The paper proposes ANTIDOTE, a loss based on a relaxation over an information-divergence neighborhood, reformulated via convex duality as an adversarial training procedure whose cost is comparable to standard cross-entropy. Defining noisy-sample neighborhoods through information divergence lets the method adaptively down-weight those samples' gradient contributions; this "forgetting" behavior is unified with distributional robustness, requires no auxiliary modules, integrates seamlessly into standard cross-entropy pipelines, and retains strong theoretical interpretability. Contribution/Results: Extensive experiments demonstrate consistent gains over state-of-the-art robust losses across symmetric, asymmetric, synthetic, and real-world noisy benchmarks; training efficiency approaches that of standard cross-entropy, while generalization and training stability are markedly improved.

📝 Abstract
We introduce ANTIDOTE, a new class of objectives for learning under noisy labels which are defined in terms of a relaxation over an information-divergence neighborhood. Using convex duality, we provide a reformulation as an adversarial training method that has similar computational cost to training with standard cross-entropy loss. We show that our approach adaptively reduces the influence of the samples with noisy labels during learning, exhibiting a behavior that is analogous to forgetting those samples. ANTIDOTE is effective in practical environments where label noise is inherent in the training data or where an adversary can alter the training labels. Extensive empirical evaluations on different levels of symmetric, asymmetric, human annotation, and real-world label noise show that ANTIDOTE outperforms leading comparable losses in the field and enjoys a time complexity that is very close to that of the standard cross entropy loss.
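The abstract describes adaptively reducing the influence of high-loss (likely mislabeled) samples, a behavior analogous to forgetting them. The paper's exact objective is not reproduced here; as an illustrative sketch only, one common dual form of such divergence-constrained objectives is a temperature-scaled soft-min reweighting of per-sample losses, where sample weights decay exponentially with loss. The function name `softmin_reweighted_loss` and the `temperature` parameter are our own, not from the paper.

```python
import numpy as np

def softmin_reweighted_loss(losses, temperature=1.0):
    """Illustrative sketch (not the paper's exact objective): aggregate
    per-sample losses with weights w_i proportional to exp(-loss_i / T),
    so high-loss samples -- the ones most likely to carry noisy labels --
    contribute little gradient, mimicking "forgetting" them.
    """
    losses = np.asarray(losses, dtype=float)
    # Shift by the minimum loss before exponentiating for numerical stability;
    # the normalization below cancels the shift.
    shifted = -(losses - losses.min()) / temperature
    weights = np.exp(shifted)
    weights /= weights.sum()
    return float(np.sum(weights * losses)), weights
```

For example, with per-sample losses `[0.1, 0.2, 5.0]` and a small temperature, the third (high-loss, plausibly mislabeled) sample receives a near-zero weight, so the aggregate loss stays close to the clean samples' losses rather than being dominated by the outlier.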
Problem

Research questions and friction points this paper is trying to address.

Learning robust models under noisy label conditions
Reducing influence of mislabeled samples adaptively
Handling symmetric, asymmetric, and real-world label noise
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial training with divergence-based objectives
Adaptively reduces noisy label influence
Similar computational cost to cross-entropy