AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation

📅 2024-05-19
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing data augmentation methods employ fixed or random augmentation strengths, leading to misalignment between augmented samples and the model's dynamic training state, thereby exacerbating both underfitting and overfitting. To address this, we propose a reinforcement-learning-based, sample-level adaptive augmentation framework. Our approach introduces a hyperparameter-free policy network that dynamically generates an optimal augmentation strength for each individual sample in real time. We further design a co-optimization architecture in which the policy network and the target model are jointly trained end-to-end, with gradient feedback enabling tight alignment between augmentation scheduling and model learning. Evaluated on benchmark datasets (CIFAR-10/100, ImageNet) and diverse backbone architectures (ResNet, ViT), our method consistently outperforms state-of-the-art augmentation techniques, achieving superior generalization while maintaining computational efficiency.

📝 Abstract
Data augmentation (DA) is widely employed to improve the generalization performance of deep models. However, most existing DA methods use augmentation operations with random magnitudes throughout training. While this fosters diversity, it can also inevitably introduce uncontrolled variability in augmented data, which may cause misalignment with the evolving training status of the target models. Both theoretical and empirical findings suggest that this misalignment increases the risks of underfitting and overfitting. To address these limitations, we propose AdaAugment, an innovative and tuning-free Adaptive Augmentation method that utilizes reinforcement learning to dynamically adjust augmentation magnitudes for individual training samples based on real-time feedback from the target network. Specifically, AdaAugment features a dual-model architecture consisting of a policy network and a target network, which are jointly optimized to effectively adapt augmentation magnitudes. The policy network optimizes the variability within the augmented data, while the target network utilizes the adaptively augmented samples for training. Extensive experiments across benchmark datasets and deep architectures demonstrate that AdaAugment consistently outperforms other state-of-the-art DA methods in effectiveness while maintaining remarkable efficiency.
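The abstract's core mechanism, a policy network that emits one augmentation magnitude per training sample, can be sketched in a few lines of NumPy. Everything below (the linear-plus-sigmoid `policy_magnitude`, the noise-based `augment`, the feature and batch shapes) is a hypothetical stand-in for illustration, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_magnitude(features, w):
    """Hypothetical policy head: map per-sample feedback features to an
    augmentation magnitude in (0, 1) via a sigmoid over a linear score."""
    return 1.0 / (1.0 + np.exp(-features @ w))

def augment(x, magnitude):
    """Toy augmentation: additive noise scaled per sample by its magnitude.
    Stands in for whatever operation family the real method applies."""
    return x + magnitude[:, None] * rng.normal(size=x.shape)

# Stand-in per-sample "training state" features and policy weights.
features = rng.normal(size=(4, 3))
w = rng.normal(size=3)

magnitudes = policy_magnitude(features, w)   # one magnitude per sample
x = rng.normal(size=(4, 5))                  # a mini-batch of inputs
x_aug = augment(x, magnitudes)               # adaptively augmented batch
```

The key property this sketch preserves is that the magnitude is a per-sample quantity produced from feedback features, rather than a global or random constant.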
Problem

Research questions and friction points this paper is trying to address.

Dynamic adjustment of augmentation magnitudes for better alignment
Reducing risks of underfitting and overfitting in training
Adaptive augmentation without manual tuning for improved performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses reinforcement learning for adaptive augmentation
Dual-model architecture optimizes augmentation dynamically
Adjusts magnitudes based on real-time model feedback
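The closed feedback loop these bullets describe can be illustrated with a deliberately simplified scalar example: the target model's loss steers the policy parameter that sets the augmentation magnitude. The quadratic "loss", the fixed optimum of 0.3, and plain gradient descent are assumptions made for illustration; the paper uses reinforcement learning driven by real training signals:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy closed loop: the target model's loss acts as real-time feedback
# driving the policy parameter that controls augmentation magnitude.
w = 0.0    # scalar policy parameter; magnitude = sigmoid(w)
lr = 0.5
for _ in range(50):
    m = sigmoid(w)
    # Pretend the target model trains best at magnitude 0.3, i.e.
    # loss = (m - 0.3)^2, so d(loss)/dw = 2*(m - 0.3) * m*(1 - m).
    grad_w = 2.0 * (m - 0.3) * m * (1.0 - m)
    w -= lr * grad_w   # move the magnitude toward the feedback optimum
```

Over the loop the magnitude drifts from its initial 0.5 toward the feedback optimum, which is the qualitative behavior the "real-time feedback" bullet claims.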
Suorong Yang
Nanjing University
Computer Vision · Deep Learning · Multimodal Learning
Peijia Li
Nanjing University
Xin Xiong
University of Southern California
Image Processing · Computer Vision · Video Compression
Shen Furao
School of Artificial Intelligence, Nanjing University, China
Jian Zhao
School of Electronic Science and Engineering, Nanjing University, China