Learning domain-invariant features through channel-level sparsification for Out-of-Distribution Generalization

📅 2026-03-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of deep models to spurious, domain-specific features under out-of-distribution (OOD) scenarios, which undermines generalization. To mitigate this, the authors propose a channel-wise causal masking strategy combined with a Hierarchical Causal Dropout mechanism to explicitly disentangle causal and non-causal features at the representation level. Furthermore, they introduce a Matrix-based Mutual Information optimization objective and enhance the VICReg regularizer with StyleMix augmentation to effectively decouple semantic content from stylistic variations. The resulting framework achieves substantial improvements over existing methods across multiple OOD benchmarks, demonstrating superior and more stable cross-domain generalization performance.

📝 Abstract
Out-of-Distribution (OOD) generalization has become a primary metric for evaluating image analysis systems. Since deep learning models tend to capture domain-specific context, they often develop shortcut dependencies on these non-causal features, leading to inconsistent performance across different data sources. Current techniques, such as invariance learning, attempt to mitigate this. However, they struggle to isolate highly mixed features within deep latent spaces. This limitation prevents them from fully resolving the shortcut learning problem. In this paper, we propose Hierarchical Causal Dropout (HCD), a method that uses channel-level causal masks to enforce feature sparsity. This approach allows the model to separate causal features from spurious ones, effectively performing a causal intervention at the representation level. The training is guided by a Matrix-based Mutual Information (MMI) objective to minimize the mutual information between latent features and domain labels, while simultaneously maximizing the information shared with class labels. To ensure stability, we incorporate a StyleMix-driven VICReg module, which prevents the masks from accidentally filtering out essential causal data. Experimental results on OOD benchmarks show that HCD performs better than existing top-tier methods.
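The core idea of channel-level causal masking can be illustrated with a minimal sketch. The function below is a hypothetical illustration, not the authors' implementation: it assumes a per-channel importance score (which HCD would learn during training) and keeps only the top-scoring channels, zeroing out the rest to enforce the sparsity the abstract describes.

```python
import numpy as np

def channel_causal_mask(features, scores, keep_ratio=0.5):
    """Suppress low-importance channels via a hard binary mask.

    features: (batch, channels) array of channel activations
    scores:   (channels,) array of assumed per-channel causal scores
    keep_ratio: fraction of channels to retain
    """
    k = max(1, int(keep_ratio * scores.size))
    keep = np.argsort(scores)[-k:]      # indices of the top-k channels
    mask = np.zeros_like(scores)
    mask[keep] = 1.0                    # 1 = kept (causal), 0 = dropped (spurious)
    return features * mask

feats = np.array([[1.0, 2.0, 3.0, 4.0]])
scores = np.array([0.9, 0.1, 0.8, 0.2])  # hypothetical learned scores
masked = channel_causal_mask(feats, scores, keep_ratio=0.5)
# channels 0 and 2 survive: [[1.0, 0.0, 3.0, 0.0]]
```

In the actual method the mask would be learned jointly with the MMI objective rather than derived from fixed scores; this sketch only shows the masking operation itself.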
Problem

Research questions and friction points this paper is trying to address.

Out-of-Distribution Generalization
Domain Invariance
Shortcut Learning
Causal Features
Feature Sparsity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Causal Dropout
channel-level sparsification
causal intervention
Matrix-based Mutual Information
Out-of-Distribution Generalization