$ε$-Softmax: Approximating One-Hot Vectors for Mitigating Label Noise

📅 2025-08-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Label noise severely degrades the generalization performance of deep neural networks, while existing symmetric loss methods often suffer from underfitting due to overly restrictive symmetry constraints. To address this, we propose ε-softmax: a simple yet effective modification to standard softmax that introduces a controllable error margin ε, enabling outputs to approximate—but not strictly equal—hard one-hot labels. This implicitly regularizes arbitrary loss functions, yielding tunable noise-robust learning. We theoretically establish that the excess risk bound decays controllably with ε. Empirically, ε-softmax synergistically enhances symmetric losses, improving both robustness and fitting capacity. Extensive experiments on synthetic and real-world noisy benchmarks—including CIFAR-10/100-N and WebVision—demonstrate consistent and significant improvements over state-of-the-art methods. Our implementation is publicly available.

📝 Abstract
Noisy labels pose a common challenge for training accurate deep neural networks. To mitigate label noise, prior studies have proposed various robust loss functions to achieve noise tolerance in the presence of label noise, particularly symmetric losses. However, they usually suffer from the underfitting issue due to the overly strict symmetric condition. In this work, we propose a simple yet effective approach for relaxing the symmetric condition, namely $ε$-softmax, which simply modifies the outputs of the softmax layer to approximate one-hot vectors with a controllable error $ε$. Essentially, $ε$-softmax not only acts as an alternative for the softmax layer, but also implicitly plays the crucial role in modifying the loss function. We prove theoretically that $ε$-softmax can achieve noise-tolerant learning with controllable excess risk bound for almost any loss function. Recognizing that $ε$-softmax-enhanced losses may slightly reduce fitting ability on clean datasets, we further incorporate them with one symmetric loss, thereby achieving a better trade-off between robustness and effective learning. Extensive experiments demonstrate the superiority of our method in mitigating synthetic and real-world label noise. The code is available at https://github.com/cswjl/eps-softmax.
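The abstract describes modifying the softmax output so that it approximates a one-hot vector with a controllable error ε. A minimal numpy sketch of one plausible form is a convex combination of the softmax distribution and the one-hot vector of its arg-max class; the exact parameterization in the paper may differ, and `eps_softmax` is an illustrative name, not the authors' API:

```python
import numpy as np

def eps_softmax(logits, eps=0.1):
    """Illustrative sketch: push softmax probabilities toward the
    one-hot vector of the arg-max class, leaving a controllable
    deviation governed by eps (smaller eps -> closer to one-hot)."""
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(z)
    p /= p.sum(axis=-1, keepdims=True)               # standard softmax
    one_hot = np.zeros_like(p)
    one_hot[np.arange(p.shape[0]), p.argmax(axis=-1)] = 1.0
    # Convex combination: (1 - eps) * one-hot + eps * softmax.
    # Each row still sums to 1, and the arg-max entry is >= 1 - eps.
    return (1.0 - eps) * one_hot + eps * p

probs = eps_softmax(np.array([[2.0, 1.0, 0.1]]), eps=0.1)
```

Plugging such an output into an arbitrary loss implicitly reshapes that loss, which is how the method claims noise tolerance for almost any loss function while retaining a tunable trade-off via ε.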
Problem

Research questions and friction points this paper is trying to address.

Mitigating label noise in deep neural network training
Relaxing the overly strict symmetric condition on robust loss functions
Balancing robustness and fitting ability with ε-softmax
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modifies softmax outputs to approximate one-hot vectors with a controllable error ε
Achieves noise-tolerant learning with a controllable excess risk bound for almost any loss
Combines ε-softmax-enhanced losses with a symmetric loss for a better robustness-fitting trade-off
👥 Authors
Jialiang Wang
Research Scientist, Meta AI
Computer Vision · Generative AI
Xiong Zhou
Applied Scientist, Amazon
Computer Vision · Machine Learning
Deming Zhai
Faculty of Computing, Harbin Institute of Technology
Junjun Jiang
Harbin Institute of Technology
Image Processing · Computer Vision · Machine Learning
Xiangyang Ji
Department of Automation, Tsinghua University
Xianming Liu
Faculty of Computing, Harbin Institute of Technology