🤖 AI Summary
Deep learning models often rely on spurious correlations, leading to substantial degradation in worst-group performance under subgroup distribution shifts. To address this, we propose an embedding-space regularization framework that establishes, for the first time, a theoretical connection between embedding representations and worst-group error. We prove that the direction of mean embedding differences across groups effectively separates core features from spurious ones. Leveraging this insight, we design a direction-sensitive regularization term applied directly to the embedding layer, explicitly suppressing reliance on spurious features while promoting the learning of invariant core features. Our method requires no additional annotations or data augmentation, and it enjoys both theoretical guarantees and implementation simplicity. Extensive experiments on multiple vision and language benchmarks demonstrate significant improvements in worst-group accuracy, achieving state-of-the-art performance, and show notably enhanced generalization robustness, particularly for minority subgroups and out-of-distribution scenarios.
📝 Abstract
Deep learning models achieve strong performance across various domains but often rely on spurious correlations, making them vulnerable to distribution shifts. This issue is particularly severe in subpopulation shift scenarios, where models struggle on underrepresented groups. While existing methods have made progress in mitigating this issue, their performance gains remain constrained: they lack a rigorous theoretical framework connecting embedding-space representations with worst-group error. To address this limitation, we propose Spurious Correlation-Aware Embedding Regularization for Worst-Group Robustness (SCER), a novel approach that directly regularizes feature representations to suppress spurious cues. We show theoretically that worst-group error is influenced by how strongly the classifier relies on spurious versus core directions, identified from differences in group-wise mean embeddings across domains and classes. By imposing theoretical constraints at the embedding level, SCER encourages models to focus on core features while reducing sensitivity to spurious patterns. Through systematic evaluation on multiple vision and language benchmarks, we show that SCER outperforms prior state-of-the-art methods in worst-group accuracy. Our code is available at [https://github.com/MLAI-Yonsei/SCER](https://github.com/MLAI-Yonsei/SCER).
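To make the core idea concrete, here is a minimal sketch of the kind of construction the abstract describes: estimating a "core" direction from class-conditional mean embeddings and a "spurious" direction from attribute-conditional mean embeddings, then penalizing the embedding's component along the spurious direction. This is an illustrative simplification under assumed binary class labels `y` and a binary spurious attribute `a`, not the authors' exact SCER objective; the function names are hypothetical.

```python
import numpy as np

def spurious_core_directions(emb, y, a):
    """Estimate unit core/spurious directions from group-wise mean embeddings.

    emb: (n, d) embedding matrix; y: binary class labels; a: binary spurious
    attribute. The core direction is the difference of class-conditional
    means; the spurious direction is the difference of attribute-conditional
    means (a simplified stand-in for the paper's construction).
    """
    core = emb[y == 1].mean(axis=0) - emb[y == 0].mean(axis=0)
    spur = emb[a == 1].mean(axis=0) - emb[a == 0].mean(axis=0)
    core = core / (np.linalg.norm(core) + 1e-12)
    spur = spur / (np.linalg.norm(spur) + 1e-12)
    return core, spur

def spurious_penalty(emb, spur_dir):
    """Mean squared projection of embeddings onto the spurious direction.

    Adding this term to the training loss discourages the encoder from
    carrying information along the spurious direction.
    """
    return float(np.mean((emb @ spur_dir) ** 2))

# Synthetic demo: axis 0 encodes the class, axis 1 encodes the spurious cue.
rng = np.random.default_rng(0)
n, d = 400, 8
y = rng.integers(0, 2, n)
a = rng.integers(0, 2, n)
emb = rng.normal(0.0, 0.1, (n, d))
emb[:, 0] += 2.0 * y   # core feature
emb[:, 1] += 2.0 * a   # spurious feature

core, spur = spurious_core_directions(emb, y, a)
penalty = spurious_penalty(emb, spur)
# Projecting out the spurious direction drives the penalty to (near) zero.
emb_clean = emb - np.outer(emb @ spur, spur)
```

In a real training loop this penalty would be added to the classification loss with a weight hyperparameter, so gradients push the encoder toward embeddings that are invariant along the estimated spurious direction.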