AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation

📅 2024-05-19
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing data augmentation methods employ fixed or random augmentation strengths, leading to misalignment between augmented samples and the model's dynamic training state, thereby exacerbating both underfitting and overfitting. To address this, we propose a reinforcement-learning-based, sample-level adaptive augmentation framework. Our approach introduces a hyperparameter-free policy network that dynamically generates an optimal augmentation strength for each individual sample in real time. We further design a co-optimization architecture in which the policy network and the target model are jointly trained end-to-end, with gradient feedback enabling tight alignment between augmentation scheduling and model learning. Evaluated on benchmark datasets (CIFAR-10/100, ImageNet) and diverse backbone architectures (ResNet, ViT), our method consistently outperforms state-of-the-art augmentation techniques, achieving superior generalization while maintaining computational efficiency.

📝 Abstract
Data augmentation (DA) is widely employed to improve the generalization performance of deep models. However, most existing DA methods use augmentation operations with random magnitudes throughout training. While this fosters diversity, it can also inevitably introduce uncontrolled variability in augmented data, which may cause misalignment with the evolving training status of the target models. Both theoretical and empirical findings suggest that this misalignment increases the risks of underfitting and overfitting. To address these limitations, we propose AdaAugment, an innovative and tuning-free Adaptive Augmentation method that utilizes reinforcement learning to dynamically adjust augmentation magnitudes for individual training samples based on real-time feedback from the target network. Specifically, AdaAugment features a dual-model architecture consisting of a policy network and a target network, which are jointly optimized to effectively adapt augmentation magnitudes. The policy network optimizes the variability within the augmented data, while the target network utilizes the adaptively augmented samples for training. Extensive experiments across benchmark datasets and deep architectures demonstrate that AdaAugment consistently outperforms other state-of-the-art DA methods in effectiveness while maintaining remarkable efficiency.
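The abstract's core mechanism, a policy network that emits one augmentation magnitude per training sample, can be sketched in a few lines of NumPy. Everything below (the linear-plus-sigmoid `policy_magnitude`, the noise-based `augment`, the feature and batch shapes) is a hypothetical stand-in for illustration, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_magnitude(features, w):
    """Hypothetical policy head: map per-sample feedback features to an
    augmentation magnitude in (0, 1) via a sigmoid over a linear score."""
    return 1.0 / (1.0 + np.exp(-features @ w))

def augment(x, magnitude):
    """Toy augmentation: additive noise scaled per sample by its magnitude.
    Stands in for whatever operation family the real method applies."""
    return x + magnitude[:, None] * rng.normal(size=x.shape)

# Stand-in per-sample "training state" features and policy weights.
features = rng.normal(size=(4, 3))
w = rng.normal(size=3)

magnitudes = policy_magnitude(features, w)   # one magnitude per sample
x = rng.normal(size=(4, 5))                  # a mini-batch of inputs
x_aug = augment(x, magnitudes)               # adaptively augmented batch
```

The key property this sketch preserves is that the magnitude is a per-sample quantity produced from feedback features, rather than a global or random constant.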
Problem

Research questions and friction points this paper is trying to address.

Dynamic adjustment of augmentation magnitudes for better alignment
Reducing risks of underfitting and overfitting in training
Adaptive augmentation without manual tuning for improved performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses reinforcement learning for adaptive augmentation
Dual-model architecture optimizes augmentation dynamically
Adjusts magnitudes based on real-time model feedback
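The closed feedback loop these bullets describe can be illustrated with a deliberately simplified scalar example: the target model's loss steers the policy parameter that sets the augmentation magnitude. The quadratic "loss", the fixed optimum of 0.3, and plain gradient descent are assumptions made for illustration; the paper uses reinforcement learning driven by real training signals:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy closed loop: the target model's loss acts as real-time feedback
# driving the policy parameter that controls augmentation magnitude.
w = 0.0    # scalar policy parameter; magnitude = sigmoid(w)
lr = 0.5
for _ in range(50):
    m = sigmoid(w)
    # Pretend the target model trains best at magnitude 0.3, i.e.
    # loss = (m - 0.3)^2, so d(loss)/dw = 2*(m - 0.3) * m*(1 - m).
    grad_w = 2.0 * (m - 0.3) * m * (1.0 - m)
    w -= lr * grad_w   # move the magnitude toward the feedback optimum
```

Over the loop the magnitude drifts from its initial 0.5 toward the feedback optimum, which is the qualitative behavior the "real-time feedback" bullet claims.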
Suorong Yang
Nanjing University
Computer Vision · Deep Learning · Multimodal Learning
Peijia Li
Nanjing University
Xin Xiong
University of Southern California
Image Processing · Computer Vision · Video Compression
Shen Furao
School of Artificial Intelligence, Nanjing University, China
Jian Zhao
School of Electronic Science and Engineering, Nanjing University, China