DGFamba: Learning Flow Factorized State Space for Visual Domain Generalization

📅 2025-04-10
🏛️ Proceedings of the AAAI Conference on Artificial Intelligence
📈 Citations: 1
Influential: 0
🤖 AI Summary
The core challenge in visual domain generalization lies in severe domain shifts induced by inter-domain style variations, while semantic content remains stable. To address this, we propose a style-content disentanglement framework built upon selective state space models (SSMs). Our method introduces flow decomposition into SSMs for the first time, constructing a differentiable implicit flow mapping and a probabilistic path alignment mechanism to explicitly separate style and content representations in latent space. By incorporating style-augmented embeddings and flow factorization, we ensure domain-invariant state embeddings. Built on VMamba, our approach enables robust zero-shot transfer from source domains to arbitrary unseen target domains. Extensive experiments on standard benchmarks, including PACS and Office-Home, demonstrate state-of-the-art performance, with substantial improvements in cross-domain classification accuracy.

📝 Abstract
Domain generalization aims to learn a representation from the source domain that can generalize to arbitrary unseen target domains. A fundamental challenge for visual domain generalization is the domain gap caused by dramatic style variation, whereas the image content remains stable. The realm of selective state space models, exemplified by VMamba, demonstrates a global receptive field for representing the content. However, how to exploit the domain-invariant property of selective state spaces has rarely been explored. In this paper, we propose a novel Flow Factorized State Space model, dubbed DGFamba, for visual domain generalization. To maintain domain consistency, we innovatively map the style-augmented and the original state embeddings by flow factorization. In this latent flow space, each state embedding from a certain style is specified by a latent probability path. By aligning these probability paths in the latent space, the state embeddings are able to represent the same content distribution regardless of style differences. Extensive experiments conducted on various visual domain generalization settings show its state-of-the-art performance.
Problem

Research questions and friction points this paper is trying to address.

Addresses domain gap in visual domain generalization
Explores domain-invariant properties for selective state space
Aligns probability paths to maintain content distribution consistency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow Factorized State Space model
Map style-augmented embeddings via flow factorization
Align probability paths in latent space
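The listing and abstract name the alignment mechanism but give no pseudocode. As a rough illustration only, the sketch below shows one plausible way to align latent probability paths between original and style-augmented state embeddings: treat each step's embeddings as samples from a diagonal Gaussian and penalize the KL divergence between the two paths' per-step statistics. All function names and the Gaussian-moment parameterization are assumptions for illustration, not the paper's released implementation.

```python
import numpy as np

def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """KL( N(mu_p, var_p) || N(mu_q, var_q) ) for diagonal Gaussians."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

def path_alignment_loss(states_orig, states_aug):
    """Hypothetical probability-path alignment loss.

    Each element of states_orig / states_aug is an (N, D) array of state
    embeddings at one step of the latent path (N tokens, D channels).
    Per-step Gaussian statistics of the style-augmented states are pulled
    toward those of the original states.
    """
    loss = 0.0
    for s_o, s_a in zip(states_orig, states_aug):
        mu_o, var_o = s_o.mean(axis=0), s_o.var(axis=0) + 1e-6
        mu_a, var_a = s_a.mean(axis=0), s_a.var(axis=0) + 1e-6
        loss += kl_diag_gaussian(mu_a, var_a, mu_o, var_o)
    return loss / len(states_orig)
```

When the two paths carry the same content statistics the loss vanishes, while a style-induced shift in mean or variance yields a positive penalty; in the paper's framing, minimizing such a term would push state embeddings toward a style-invariant content distribution.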
Qi Bi
Jarvis Research Center, Tencent YouTu Lab, ShenZhen, China
Jingjun Yi
Jarvis Research Center, Tencent YouTu Lab, ShenZhen, China
Hao Zheng
Jarvis Research Center, Tencent YouTu Lab, ShenZhen, China
Haolan Zhan
Monash University (Natural Language Processing, Dialogue Systems, Responsible AI)
Wei Ji
School of Medicine, Yale University, New Haven, United States
Yawen Huang
Jarvis Research Center, Tencent YouTu Lab, ShenZhen, China
Yuexiang Li
Faculty of Science and Technology, University of Macau, Macau