Text-Driven Causal Representation Learning for Source-Free Domain Generalization

📅 2025-07-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep learning models suffer from degraded generalization when training and test distributions differ. Conventional domain generalization (DG) relies on multiple source domains, incurring high data acquisition costs; emerging source-free DG (SFDG) leverages vision-language models (e.g., CLIP) with text prompts to guide visual representation learning, alleviating data dependency but failing to eliminate domain-specific confounding factors. This work pioneers the integration of causal inference into the SFDG framework: we construct a confounder dictionary and design a text-driven causal intervention network to disentangle and strengthen domain-invariant features; further, we enhance style word embeddings and impose causal regularization. Extensive experiments on PACS, VLCS, OfficeHome, and DomainNet demonstrate substantial improvements over state-of-the-art SFDG methods, validating the efficacy of causally grounded representations for cross-domain generalization.

📝 Abstract
Deep learning often struggles when training and test data distributions differ. Traditional domain generalization (DG) tackles this by including data from multiple source domains, which is impractical due to expensive data collection and annotation. Recent vision-language models like CLIP enable source-free domain generalization (SFDG) by using text prompts to simulate visual representations, reducing data demands. However, existing SFDG methods struggle with domain-specific confounders, limiting their generalization capabilities. To address this issue, we propose TDCRL (Text-Driven Causal Representation Learning), the first method to integrate causal inference into the SFDG setting. TDCRL operates in two steps: first, it employs data augmentation to generate style word vectors and combines them with class information to produce text embeddings that simulate visual representations; second, it trains a causal intervention network with a confounder dictionary to extract domain-invariant features. Grounded in causal learning, our approach offers a clear and effective mechanism for extracting domain-invariant features, ensuring robust generalization. Extensive experiments on PACS, VLCS, OfficeHome, and DomainNet show state-of-the-art performance, proving TDCRL's effectiveness in SFDG.
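The two-step pipeline in the abstract can be sketched in miniature. This is a hypothetical toy illustration, not the paper's implementation: random vectors stand in for CLIP text features, the "style words" are simple additive perturbations, and the learned causal intervention network is replaced by a projection that removes confounder-dictionary directions. All function names and dimensions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy embedding dimension (real CLIP text features are 512- or 768-d)

def simulate_text_embeddings(class_embs, n_styles=4, noise=0.1):
    """Step 1 (sketch): augment with random 'style word' vectors and combine
    them with class embeddings to simulate visual representations."""
    styles = rng.normal(scale=noise, size=(n_styles, DIM))
    # one simulated embedding per (style, class) pair
    sims = class_embs[None, :, :] + styles[:, None, :]
    return sims.reshape(-1, DIM)

def causal_intervention(feats, confounders):
    """Step 2 (sketch): stand-in for the causal intervention network —
    project features onto the orthogonal complement of the confounder
    dictionary's span, removing confounder-aligned components."""
    q, _ = np.linalg.qr(confounders.T)   # orthonormal basis of confounder subspace
    return feats - (feats @ q) @ q.T

class_embs = rng.normal(size=(3, DIM))   # 3 toy classes
confounders = rng.normal(size=(2, DIM))  # 2 toy confounder dictionary entries
feats = simulate_text_embeddings(class_embs)
invariant = causal_intervention(feats, confounders)
# the resulting features carry no component along any confounder direction
print(np.abs(invariant @ confounders.T).max() < 1e-8)  # → True
```

The projection here is only the simplest linear instance of "removing confounder-specific variation"; the paper trains a network to perform the intervention rather than using a fixed projection.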
Problem

Research questions and friction points this paper is trying to address.

Addresses domain-specific confounders in source-free domain generalization
Integrates causal inference to extract domain-invariant features
Reduces data demands using text-driven visual representation simulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses text prompts to simulate visual representations
Integrates causal inference into domain generalization
Trains causal network with confounder dictionary
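Causal interventions of the kind these bullets describe are usually formalized with the standard backdoor adjustment. Assuming the confounder dictionary plays the role of the confounder set (the paper's exact objective may differ), the adjustment marginalizes out the confounders:

```latex
P(Y \mid \mathrm{do}(X)) \;=\; \sum_{c \in \mathcal{C}} P(Y \mid X, c)\, P(c)
```

Here \(X\) is the (simulated) visual representation, \(Y\) the class label, and \(\mathcal{C}\) the confounder dictionary; conditioning on each dictionary entry and averaging blocks the backdoor path through domain-specific confounders.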