Meta Curvature-Aware Minimization for Domain Generalization

📅 2024-12-16
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the limited generalization performance of domain generalization (DG) models on unseen target domains, this paper proposes a curvature-aware joint optimization framework designed to guide models toward flatter minima of the loss landscape. Methodologically, it introduces (1) a curvature-adaptive metric that selectively amplifies curvature awareness only near local minima, and (2) the first integration of Sharpness-Aware Minimization (SAM) with a meta-learning-based surrogate gap, yielding a unified multi-objective optimization formulation. Theoretical analysis provides bounds on the generalization error and convergence rate. Evaluated on five standard DG benchmarks—PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet—the method consistently outperforms existing state-of-the-art approaches, achieving significant improvements in cross-domain generalization performance.

📝 Abstract
Domain generalization (DG) aims to enhance the ability of models trained on source domains to generalize effectively to unseen domains. Recently, Sharpness-Aware Minimization (SAM) has shown promise in this area by reducing the sharpness of the loss landscape to obtain more generalized models. However, SAM and its variants sometimes fail to guide the model toward a flat minimum, and their training processes exhibit limitations that hinder further improvements in model generalization. In this paper, we first propose an improved model training process aimed at encouraging the model to converge to a flat minimum. To achieve this, we design a curvature metric that has a minimal effect when the model is far from convergence but becomes increasingly influential in indicating the curvature of the minimum as the model approaches a local minimum. We then derive a novel algorithm from this metric, called Meta Curvature-Aware Minimization (MeCAM), to minimize the curvature around the local minima. Specifically, the optimization objective of MeCAM simultaneously minimizes the regular training loss, the surrogate gap of SAM, and the surrogate gap of meta-learning. We provide theoretical analysis of MeCAM's generalization error and convergence rate, and demonstrate its superiority over existing DG methods through extensive experiments on five benchmark DG datasets: PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet. Code will be available on GitHub.
Problem

Research questions and friction points this paper is trying to address.

Improves model generalization to unseen domains
Proposes curvature-aware minimization for flat minima
Introduces MeCAM algorithm for better domain generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Improved model training for flat minima convergence
Curvature metric for local minima optimization
Meta Curvature-Aware Minimization (MeCAM) algorithm
Ziyang Chen
Peking University
Yiwen Ye
School of Computer Science and Engineering, Northwestern Polytechnical University, China
Feilong Tang
Faculty of Engineering, Monash University, Australia
Yongsheng Pan
Northwestern Polytechnical University
Yong Xia
School of Computer Science and Engineering, Northwestern Polytechnical University, China; Research & Development Institute of Northwestern Polytechnical University in Shenzhen, China; Ningbo Institute of Northwestern Polytechnical University, China