DGFamba: Learning Flow Factorized State Space for Visual Domain Generalization

📅 2025-04-10
🏛️ Proceedings of the AAAI Conference on Artificial Intelligence
📈 Citations: 1
Influential: 0
🤖 AI Summary
The core challenge in visual domain generalization lies in severe domain shifts induced by inter-domain style variations, while semantic content remains stable. To address this, we propose a style-content disentanglement framework built upon selective state space models (SSMs). Our method introduces flow decomposition into SSMs for the first time, constructing a differentiable implicit flow mapping and a probabilistic path alignment mechanism to explicitly separate style and content representations in latent space. By incorporating style-augmented embeddings and flow factorization, we ensure domain-invariant state embeddings. Built on VMamba, our approach enables robust zero-shot transfer from source domains to arbitrary unseen target domains. Extensive experiments on standard benchmarks, including PACS and Office-Home, demonstrate state-of-the-art performance, with substantial improvements in cross-domain classification accuracy.

📝 Abstract
Domain generalization aims to learn a representation from the source domain that can generalize to arbitrary unseen target domains. A fundamental challenge for visual domain generalization is the domain gap caused by dramatic style variation, whereas the image content remains stable. The realm of selective state space models, exemplified by VMamba, demonstrates a global receptive field for representing the content. However, how to exploit the domain-invariant property of selective state spaces has rarely been explored. In this paper, we propose a novel Flow Factorized State Space model, dubbed DGFamba, for visual domain generalization. To maintain domain consistency, we innovatively map the style-augmented and the original state embeddings by flow factorization. In this latent flow space, each state embedding from a certain style is specified by a latent probability path. By aligning these probability paths in the latent space, the state embeddings are able to represent the same content distribution regardless of style differences. Extensive experiments conducted on various visual domain generalization settings show its state-of-the-art performance.
Problem

Research questions and friction points this paper is trying to address.

Addresses domain gap in visual domain generalization
Explores domain-invariant properties for selective state space
Aligns probability paths to maintain content distribution consistency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow Factorized State Space model
Map style-augmented embeddings via flow factorization
Align probability paths in latent space
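The listing and abstract name the alignment mechanism but give no pseudocode. As a rough illustration only, the sketch below shows one plausible way to align latent probability paths between original and style-augmented state embeddings: treat each step's embeddings as samples from a diagonal Gaussian and penalize the KL divergence between the two paths' per-step statistics. All function names and the Gaussian-moment parameterization are assumptions for illustration, not the paper's released implementation.

```python
import numpy as np

def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """KL( N(mu_p, var_p) || N(mu_q, var_q) ) for diagonal Gaussians."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

def path_alignment_loss(states_orig, states_aug):
    """Hypothetical probability-path alignment loss.

    Each element of states_orig / states_aug is an (N, D) array of state
    embeddings at one step of the latent path (N tokens, D channels).
    Per-step Gaussian statistics of the style-augmented states are pulled
    toward those of the original states.
    """
    loss = 0.0
    for s_o, s_a in zip(states_orig, states_aug):
        mu_o, var_o = s_o.mean(axis=0), s_o.var(axis=0) + 1e-6
        mu_a, var_a = s_a.mean(axis=0), s_a.var(axis=0) + 1e-6
        loss += kl_diag_gaussian(mu_a, var_a, mu_o, var_o)
    return loss / len(states_orig)
```

When the two paths carry the same content statistics the loss vanishes, while a style-induced shift in mean or variance yields a positive penalty; in the paper's framing, minimizing such a term would push state embeddings toward a style-invariant content distribution.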
Qi Bi
Jarvis Research Center, Tencent YouTu Lab, ShenZhen, China
Jingjun Yi
Jarvis Research Center, Tencent YouTu Lab, ShenZhen, China
Hao Zheng
Jarvis Research Center, Tencent YouTu Lab, ShenZhen, China
Haolan Zhan
Monash University (Natural Language Processing, Dialogue Systems, Responsible AI)
Wei Ji
School of Medicine, Yale University, New Haven, United States
Yawen Huang
Jarvis Research Center, Tencent YouTu Lab, ShenZhen, China
Yuexiang Li
Faculty of Science and Technology, University of Macau, Macau