🤖 AI Summary
In real-world scenarios, variations in image backgrounds, styles, and acquisition devices severely degrade model out-of-distribution (OOD) generalization. Generic data augmentation yields inconsistent gains under such shifts, while dataset-specific augmentation relies heavily on expert priors. Existing methods struggle to jointly optimize frequency-domain adaptability and pixel-level detail preservation. To address this, we propose D-GAP, a gradient-guided adaptive frequency-spatial joint augmentation framework. D-GAP is the first method to generate frequency-sensitivity maps directly from task gradients, enabling prior-free frequency-amplitude interpolation and pixel-wise mixing in a synergistically optimized manner. This effectively mitigates model overfitting to domain-specific frequency components. Extensive experiments show consistent improvements over both generic and dataset-customized augmentation baselines: +5.3% average accuracy on four real-world datasets and +1.8% on three domain-adaptation benchmarks.
📝 Abstract
Out-of-domain (OOD) robustness is challenging to achieve in real-world computer vision applications, where shifts in image background, style, and acquisition instruments routinely degrade model performance. Generic augmentations show inconsistent gains under such shifts, whereas dataset-specific augmentations require expert knowledge and prior analysis. Moreover, prior studies show that neural networks adapt poorly to domain shifts because they exhibit a learning bias toward domain-specific frequency components. Perturbing frequency values can mitigate this bias but overlooks pixel-level details, leading to suboptimal performance. To address these problems, we propose D-GAP (Dataset-agnostic and Gradient-guided augmentation in Amplitude and Pixel spaces), which improves OOD robustness by introducing targeted augmentation in both the amplitude (frequency) space and the pixel space. Unlike conventional handcrafted augmentations, D-GAP computes sensitivity maps in the frequency space from task gradients, which reflect how strongly the model responds to different frequency components, and uses these maps to adaptively interpolate amplitudes between source and target samples. In this way, D-GAP reduces the learning bias in frequency space, while a complementary pixel-space blending procedure restores fine spatial details. Extensive experiments on four real-world datasets and three domain-adaptation benchmarks show that D-GAP consistently outperforms both generic and dataset-specific augmentations, improving average OOD performance by +5.3% on real-world datasets and +1.8% on benchmark datasets.
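The amplitude-space interpolation and pixel-space blending described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `sensitivity` map is taken as a given input (D-GAP derives it from task gradients), and `lam` and `beta` are illustrative mixing coefficients.

```python
import numpy as np

def amplitude_mix(source, target, sensitivity, lam=0.5, beta=0.7):
    """Hypothetical sketch of frequency-space amplitude interpolation.

    source, target : HxW grayscale images (float arrays)
    sensitivity    : HxW map in [0, 1]; in D-GAP this comes from task
                     gradients, here it is simply provided by the caller
    lam, beta      : illustrative interpolation / blending weights
    """
    fs, ft = np.fft.fft2(source), np.fft.fft2(target)
    amp_s, phase_s = np.abs(fs), np.angle(fs)
    amp_t = np.abs(ft)
    # Interpolate amplitudes more strongly where the model is sensitive,
    # while keeping the source image's phase (spatial layout) intact.
    amp_mix = (1 - lam * sensitivity) * amp_s + lam * sensitivity * amp_t
    mixed = np.real(np.fft.ifft2(amp_mix * np.exp(1j * phase_s)))
    # Pixel-space blending restores fine spatial details lost to the
    # amplitude perturbation.
    return beta * mixed + (1 - beta) * source
```

With an all-zero sensitivity map the amplitudes are untouched and the output reduces to the source image, which makes the interpolation's behavior easy to check at the boundary.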