🤖 AI Summary
To address the poor robustness of deep models trained under concurrent label noise and data heterogeneity, this paper proposes a lightweight, learnable adaptive loss weighting method. It introduces only three learnable parameters to dynamically modulate per-sample loss weights based on sample difficulty, and updates these weights via single-step meta-validation gradients—requiring neither a clean validation set nor manual hyperparameter tuning. Embedded within a meta-learning framework, the method is agnostic to loss functions and network architectures, and operates without strong data augmentation or complex regularization. Extensive experiments across diverse noise settings (symmetric, asymmetric, instance-dependent), benchmarks (CIFAR-10/100, WebVision), and models (ResNet, ViT) demonstrate significant improvements in generalization performance—especially under high noise levels (≥60%). The approach achieves superior efficiency, broad compatibility, and strong robustness.
📝 Abstract
Training deep neural networks in the presence of noisy labels and data heterogeneity is a major challenge. We introduce Lightweight Learnable Adaptive Weighting (LiLAW), a novel method that dynamically adjusts the loss weight of each training sample based on its evolving difficulty level, categorized as easy, moderate, or hard. Using only three learnable parameters, LiLAW adaptively prioritizes informative samples throughout training by updating these weights using a single mini-batch gradient descent step on the validation set after each training mini-batch, without requiring excessive hyperparameter tuning or a clean validation set. Extensive experiments across multiple general and medical imaging datasets, noise levels and types, loss functions, and architectures with and without pretraining demonstrate that LiLAW consistently enhances performance, even in high-noise environments. It is effective without heavy reliance on data augmentation or advanced regularization, highlighting its practicality. It offers a computationally efficient solution to boost model generalization and robustness in any neural network training setup.