Bi-Level Optimization for Single Domain Generalization

📅 2026-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenging setting of domain generalization with only a single labeled source domain and no access to target-domain data. To tackle this problem, the authors propose BiSDG, a novel framework that introduces bi-level optimization into single-domain generalization for the first time, decoupling task learning from domain modeling. In the upper-level optimization, label-preserving transformations generate proxy domains, while the lower level employs a lightweight domain prompt encoder to produce feature-wise linear modulation (FiLM) signals that enhance feature representations. The framework further adopts a first-order gradient approximation strategy that avoids second-order derivatives, enabling efficient training. Extensive experiments demonstrate that BiSDG significantly outperforms existing methods across multiple single-domain generalization benchmarks, achieving new state-of-the-art performance.
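The FiLM-style modulation in the summary can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: the encoder here is a single linear map, and the weight names, dimensions, and identity-centered scaling are assumptions for the sketch.

```python
import numpy as np

def film_modulate(features, gamma, beta):
    """Feature-wise linear modulation: per-channel scale and shift."""
    return gamma * features + beta

def domain_prompt_encoder(domain_desc, W_gamma, W_beta):
    """Hypothetical lightweight prompt encoder: maps a domain
    descriptor vector to per-channel (gamma, beta) FiLM parameters.
    Gamma is centered at 1 so an all-zero descriptor is the identity."""
    gamma = 1.0 + domain_desc @ W_gamma
    beta = domain_desc @ W_beta
    return gamma, beta

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))       # batch of 4, 8 feature channels
domain_desc = rng.normal(size=(2,))      # 2-dim domain descriptor (toy)
W_gamma = rng.normal(scale=0.1, size=(2, 8))
W_beta = rng.normal(scale=0.1, size=(2, 8))

gamma, beta = domain_prompt_encoder(domain_desc, W_gamma, W_beta)
modulated = film_modulate(features, gamma, beta)
print(modulated.shape)  # one (gamma, beta) pair modulates the whole batch
```

Because the modulation is an affine map per channel, it adds only two parameters per feature channel on top of the base network, which is what makes the prompt encoder lightweight.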
📝 Abstract
Generalizing from a single labeled source domain to unseen target domains, without access to any target data during training, remains a fundamental challenge in robust machine learning. We address this underexplored setting, known as Single Domain Generalization (SDG), by proposing BiSDG, a bi-level optimization framework that explicitly decouples task learning from domain modeling. BiSDG simulates distribution shifts through surrogate domains constructed via label-preserving transformations of the source data. To capture domain-specific context, we propose a domain prompt encoder that generates lightweight modulation signals to produce augmenting features via feature-wise linear modulation. The learning process is formulated as a bi-level optimization problem: the inner objective optimizes task performance under fixed prompts, while the outer objective maximizes generalization across the surrogate domains by updating the domain prompt encoder. We further develop a practical gradient approximation scheme that enables efficient bi-level training without second-order derivatives. Extensive experiments on various SDG benchmarks demonstrate that BiSDG consistently outperforms prior methods, setting new state-of-the-art performance in the SDG setting.
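The alternating inner/outer scheme with a first-order approximation can be sketched on a toy 1-D regression problem. Everything here is illustrative: the surrogate domains are simple input shifts standing in for the paper's label-preserving transformations, the prompt is a single additive scalar rather than a FiLM encoder, and all hyperparameters are arbitrary. The key point is the last line of the loop: the outer update treats the freshly updated task weights as a constant, so no second-order derivatives are needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Single labeled source domain: y = 2x.
x_src = rng.normal(size=(32,))
y_src = 2.0 * x_src
# Surrogate domains: label-preserving input shifts (hypothetical stand-in
# for the paper's transformations).
surrogates = [x_src + s for s in (0.0, 0.5, -0.5)]

w = 0.0  # task parameter (inner level)
p = 0.0  # domain-prompt parameter (outer level), an additive feature shift

def loss(w, p, x, y):
    return np.mean((w * (x + p) - y) ** 2)

def grad_w(w, p, x, y):  # d(loss)/dw
    return np.mean(2 * (w * (x + p) - y) * (x + p))

def grad_p(w, p, x, y):  # d(loss)/dp with w held fixed (first-order)
    return np.mean(2 * (w * (x + p) - y) * w)

lr_in, lr_out = 0.05, 0.01
for _ in range(200):
    # Inner step: task loss on the source domain, prompt fixed.
    w -= lr_in * grad_w(w, p, x_src, y_src)
    # Outer step: update the prompt on the surrogate domains, treating the
    # updated w as a constant instead of differentiating through the
    # inner step (this is the first-order approximation).
    g = np.mean([grad_p(w, p, xs, y_src) for xs in surrogates])
    p -= lr_out * g

print("final source loss:", loss(w, p, x_src, y_src))
```

The exact bi-level gradient of the outer objective would require differentiating the inner update with respect to the prompt, a second-order term; dropping it, as above, is the standard first-order shortcut (familiar from first-order MAML) that the abstract's "practical gradient approximation scheme" refers to in spirit.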
Problem

Research questions and friction points this paper is trying to address.

Single Domain Generalization
Domain Generalization
Distribution Shift
Bi-Level Optimization
Unseen Target Domains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bi-level Optimization
Single Domain Generalization
Domain Prompt Encoder
Feature-wise Linear Modulation
Surrogate Domains