Deep Attention Reweighting: Post-Hoc Attention-Based Feature Aggregation in CNNs for Disentangling Core and Spurious Features under Spurious Correlations

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Convolutional neural networks often generalize poorly and raise fairness concerns because they rely on spurious correlations in the training data. This work identifies global average pooling (GAP) as a key source of the problem: it collapses spatially distinct core and spurious features into a single entangled representation during aggregation. To address this, the authors propose Deep Attention Reweighting (DAR), a post-hoc attention-based aggregation module that replaces GAP and adaptively weights spatial locations before aggregation, so that spurious features can be selectively suppressed. The module is retrained jointly with the classification head. Across multiple datasets, metrics, and ablations, DAR consistently outperforms Deep Feature Reweighting (DFR), which retrains only the classification head, and reduces the model's reliance on spurious correlations.

📝 Abstract

Convolutional Neural Networks (CNNs) often exploit spurious correlations in datasets, learning superficially predictive yet causally irrelevant features, leading to poor generalization and fairness issues. Deep Feature Reweighting (DFR) is a post-hoc technique that reduces a trained model's reliance on spurious correlations by retraining its classification head on a target dataset. However, we show that DFR is fundamentally constrained by operating on entangled features, limiting its ability to amplify the core features while simultaneously suppressing the spurious ones. We trace this entanglement to the ubiquitous Global Average Pooling (GAP) layer, which indiscriminately collapses spatially distinct core and spurious features into a single representation. To address this, we propose Deep Attention Reweighting (DAR), a post-hoc attention-based aggregation module that replaces GAP and is retrained jointly with the classification head. DAR computes an adaptive weighting of spatial locations across feature maps, enabling selective suppression of spurious features before the collapse into entangled features. Across various datasets, metrics, and ablations, DAR consistently outperforms DFR, demonstrating that our attention-based aggregation mitigates GAP-induced entanglement and reduces spurious reliance.
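The contrast between GAP and attention-based aggregation can be illustrated with a small NumPy sketch. This is not the authors' DAR module, whose architecture is not specified in this summary. It is a generic softmax spatial-attention pooling, and the names `global_average_pool`, `attention_pool`, and the scoring vector `w` are hypothetical. The feature values are hand-picked toy numbers, and `w` is set by hand here, whereas the abstract states that DAR is retrained jointly with the classification head.

```python
import numpy as np


def global_average_pool(feats):
    """Uniform mean over spatial locations: (C, H, W) -> (C,)."""
    return feats.mean(axis=(1, 2))


def attention_pool(feats, w, b=0.0):
    """Softmax-weighted spatial aggregation: (C, H, W) -> (C,).

    w is a hypothetical per-channel scoring vector (assumption, not from
    the paper); it assigns one scalar score to each spatial location.
    """
    c, h, wd = feats.shape
    flat = feats.reshape(c, h * wd)           # (C, N) with N = H * W
    scores = w @ flat + b                     # (N,) one score per location
    scores = scores - scores.max()            # stabilize the softmax
    attn = np.exp(scores) / np.exp(scores).sum()
    return flat @ attn, attn.reshape(h, wd)   # pooled (C,), attention map (H, W)


# Toy 2-channel, 2x2 feature map (illustrative values only).
# Channel 0 fires on the object at (0, 0) AND on the background at (1, 1).
# Channel 1 fires only on the object.
feats = np.array([
    [[1.0, 0.0],
     [0.0, 5.0]],
    [[1.0, 0.0],
     [0.0, 0.0]],
])

# GAP mixes the object and background responses of channel 0.
gap = global_average_pool(feats)

# Attention that scores locations by channel 1 focuses on the object,
# so the background response of channel 0 is suppressed before pooling.
pooled, attn = attention_pool(feats, w=np.array([0.0, 10.0]))

# With all-zero scores the attention is uniform and reduces to GAP.
uniform_pooled, _ = attention_pool(feats, w=np.zeros(2))
```

In this toy case GAP averages the object and background responses of channel 0 into one number, while the attention-weighted sum stays close to the object response alone. Setting `w` to zero recovers GAP exactly, which shows that GAP is the special case of uniform spatial weights.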

Problem

Research questions and friction points this paper is trying to address.

spurious correlations

feature entanglement

global average pooling

post-hoc reweighting

core features

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Attention Reweighting

spurious correlations

feature disentanglement