🤖 AI Summary
Existing information bottleneck (IB)-based feature attribution methods model the representation of a single layer only, overlooking the fact that decision evidence in vision transformers is distributed across multiple layers. To address this, we propose the Comprehensive Information Bottleneck (CoIBA), the first IB framework that jointly optimizes multiple target layers. CoIBA compresses the features of every target layer by injecting noise controlled by a parametric damping ratio shared across layers, and constrains the layer-wise mutual information through a variational upper bound. The shared ratio exploits correlations between layers to recover decision-relevant cues that single-layer approaches miss, improving the completeness and faithfulness of the attributions. Extensive experiments show that CoIBA outperforms state-of-the-art methods on multiple quantitative attribution metrics and identifies the input regions critical to the model's decision more accurately.
📝 Abstract
Feature attribution methods reveal how much each input variable contributes to a model's decision, yielding an attribution map that serves as an explanation. Existing methods grounded in the information bottleneck principle compute information at a single layer to obtain attributions, compressing that layer's features by injecting noise through a parametric damping ratio. However, an attribution obtained at a single layer neglects evidence for the decision that is distributed across layers. In this paper, we introduce a comprehensive information bottleneck (CoIBA), which discovers the relevant information in each targeted layer to explain the decision-making process. Our core idea is to apply an information bottleneck at multiple targeted layers and to estimate the comprehensive information by sharing one parametric damping ratio across those layers. Because the ratio is shared, relevant information from one targeted layer can compensate for over-compression in another, recovering clues to the decision that would otherwise be omitted. We propose a variational approach that upper-bounds the layer-wise information so that the relevant information of each layer is reflected fairly. As a result, CoIBA guarantees that the discarded activations are unnecessary for the decision in every targeted layer. Extensive experimental results demonstrate that CoIBA improves the faithfulness of feature attributions.
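
To make the noise-injection idea concrete, here is a minimal NumPy sketch of a damping ratio shared across several layers. It is an illustration under stated assumptions, not the authors' implementation. The names `bottleneck`, `shared_ratio_bottlenecks`, and `alpha` are hypothetical. The noise statistics are estimated over the tokens of a single input, which is a simplification. The per-layer term is the closed-form Gaussian KL divergence commonly used in single-layer IB attribution. How CoIBA combines the layer-wise terms into its variational upper bound is not specified in the abstract, so the sketch only returns them per layer.

```python
import numpy as np


def bottleneck(R, lam, rng):
    """Mix features R with Gaussian noise: Z = lam * R + (1 - lam) * eps.

    R   : (tokens, dim) features of one layer.
    lam : (tokens, 1) damping ratio in (0, 1); 1 keeps a token, 0 replaces it with noise.
    Returns Z and the element-wise KL[P(Z|R) || N(mu_R, sigma_R^2)].
    """
    # Assumption: noise statistics come from the tokens of this one input.
    mu = R.mean(axis=0, keepdims=True)
    sigma = R.std(axis=0, keepdims=True) + 1e-6
    eps = rng.normal(size=R.shape) * sigma + mu
    Z = lam * R + (1.0 - lam) * eps

    # In normalized coordinates, Z|R has mean lam * r and variance (1 - lam)^2.
    r = (R - mu) / sigma
    var = (1.0 - lam) ** 2
    kl = 0.5 * (-np.log(var) + var + (lam * r) ** 2 - 1.0)
    return Z, kl


def shared_ratio_bottlenecks(layers, alpha, seed=0):
    """Apply one damping ratio, lam = sigmoid(alpha), to every layer in `layers`."""
    rng = np.random.default_rng(seed)
    lam = 1.0 / (1.0 + np.exp(-alpha))
    lam = np.clip(lam, 1e-6, 1.0 - 1e-6)  # keep log(var) finite
    compressed, layer_info = [], []
    for R in layers:
        Z, kl = bottleneck(R, lam, rng)
        compressed.append(Z)
        layer_info.append(float(kl.mean()))  # information kept at this layer
    return lam, compressed, layer_info


# Toy usage: three layers with 4 tokens of dimension 8, one ratio per token.
demo_rng = np.random.default_rng(1)
demo_layers = [demo_rng.normal(size=(4, 8)) for _ in range(3)]
demo_alpha = np.zeros((4, 1))  # hypothetical learnable parameter, lam = 0.5
demo_lam, demo_Z, demo_info = shared_ratio_bottlenecks(demo_layers, demo_alpha)
```

In an attribution setting, `alpha` would be optimized so that the model's prediction is preserved while the layer-wise information terms stay small, and the resulting per-token `lam` would then be read as the attribution map. Because the same `lam` gates every targeted layer, a token can only be discarded if doing so is acceptable at all of those layers.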