Domain Generalization via Pareto Optimal Gradient Matching

📅 2025-07-16
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Gradient-based domain generalization suffers from inconsistent and unstable inter-domain gradient directions, along with high computational overhead from second-order derivative approximations. To address these issues, this paper proposes Pareto-Optimal Gradient Matching (POGM), which models gradient trajectories as learnable signals and jointly optimizes them within a meta-learning framework: (i) maximizing inter-domain gradient inner products to enforce directional consistency, and (ii) constraining gradients to remain aligned with the empirical risk minimization direction to suppress oscillation. POGM employs first-order meta-updates to efficiently locate Pareto-optimal solutions, eliminating the need for costly second-order approximations. Evaluated on DomainBed, POGM achieves competitive generalization performance relative to state-of-the-art methods while reducing training cost by over 30%, demonstrating both effectiveness and practical efficiency.
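The first-order matching idea in the summary can be illustrated with a toy sketch (our own illustration under simplifying assumptions, not the authors' implementation; all function and variable names are hypothetical): ascend the sum of inner products with the per-domain gradients while projecting the result back onto a small ball around the ERM gradient, so the aggregate direction stays consistent across domains without drifting from the empirical-risk trajectory.

```python
import numpy as np

def pogm_style_update(domain_grads, radius=0.5, lr=0.1, steps=50):
    """Toy first-order sketch: ascend the sum of inner products with the
    per-domain gradients, projecting back onto a ball of the given radius
    around the ERM (mean) gradient so the update cannot drift too far."""
    g_erm = np.mean(domain_grads, axis=0)   # empirical-risk-minimization direction
    s = np.sum(domain_grads, axis=0)        # ascent direction for sum_i <v, g_i>
    v = g_erm.copy()
    for _ in range(steps):
        v = v + lr * s                      # first-order ascent step
        d = v - g_erm
        n = np.linalg.norm(d)
        if n > radius:                      # keep v near the ERM trajectory
            v = g_erm + d * (radius / n)
    return v

# Three toy per-domain gradients of a 2-parameter model
grads = np.array([[1.0, 0.0], [0.8, 0.6], [0.9, -0.2]])
v = pogm_style_update(grads)
```

By construction, `v` stays within `radius` of the mean gradient while the total inner product with the domain gradients can only increase relative to plain ERM; the actual POGM meta-learner operates on full gradient trajectories rather than single gradient vectors.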

📝 Abstract
In this study, we address the gradient-based domain generalization problem, where predictors aim for consistent gradient directions across different domains. Existing methods face two main challenges. First, minimizing gradient empirical distance or gradient inner products (GIP) leads to gradient fluctuations among domains, thereby hindering straightforward learning. Second, directly applying gradient learning to the joint loss function can incur high computational overhead due to second-order derivative approximation. To tackle these challenges, we propose a new Pareto Optimality Gradient Matching (POGM) method. In contrast to existing methods that add gradient matching as regularization, we leverage gradient trajectories as collected data and apply independent training at the meta-learner. In the meta-update, we maximize GIP while limiting the learned gradient from deviating too far from the empirical risk minimization gradient trajectory. By doing so, the aggregate gradient can incorporate knowledge from all domains without suffering gradient fluctuations toward any particular domain. Experimental evaluations on datasets from DomainBed demonstrate that POGM yields competitive results against other baselines while achieving computational efficiency.
Problem

Research questions and friction points this paper is trying to address.

Address inter-domain gradient fluctuations in domain generalization
Reduce the high computational cost of gradient-based learning
Improve domain generalization via Pareto-optimal gradient matching
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pareto Optimal Gradient Matching for domain generalization
Meta-learner trained on gradient trajectories as collected data
Maximize gradient inner products (GIP) while limiting deviation from the ERM gradient
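The GIP objective named above is simply the sum of pairwise inner products between per-domain gradients: larger values mean the domains agree on a descent direction. A minimal sketch (the function name is ours, not from the paper):

```python
import numpy as np
from itertools import combinations

def gradient_inner_product(domain_grads):
    """Sum of pairwise inner products between per-domain gradient vectors.
    Larger values indicate more directionally consistent domain gradients."""
    return sum(float(np.dot(gi, gj))
               for gi, gj in combinations(domain_grads, 2))

aligned = [np.array([1.0, 0.0]), np.array([0.9, 0.1])]
opposed = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
print(gradient_inner_product(aligned))  # 0.9
print(gradient_inner_product(opposed))  # -1.0
```

POGM maximizes this quantity in the meta-update rather than adding it as a regularizer to the joint loss, which is what lets it avoid second-order derivatives.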
Khoi Do
PhD Student at Trinity College Dublin
Computer Vision · Generative AI · 3D Generation · 3D Deep Learning
Duong Nguyen
College of Engineering & Computer Science, Vin University, Hanoi, Vietnam
Nam-Khanh Le
SoICT, Hanoi University of Science and Technology, Hanoi, Vietnam
Quoc-Viet Pham
School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
Binh-Son Hua
Trinity College Dublin
Generative 3D AI · 3D Deep Learning · Computer Vision · Computer Graphics · Rendering
Won-Joo Hwang
Pusan National University, Busan, South Korea