🤖 AI Summary
Performance gains in low-level vision models often come at the cost of reduced interpretability, hindering their trustworthy deployment. Method: This paper introduces causal inference to low-level vision for the first time, proposing the task- and model-agnostic Causal Effect Map (CEM) framework. CEM employs counterfactual reasoning and gradient-driven local effect decomposition to visualize and quantify the positive or negative causal influence of each input pixel on the output. Contribution/Results: CEM uncovers several counterintuitive principles, e.g., "larger receptive fields do not necessarily improve performance" and "channel attention is ineffective for denoising", enabling cross-task diagnostic analysis and refinement of common assumptions. It is validated across diverse low-level vision tasks, including denoising, super-resolution, and deblurring, substantially deepening our understanding of deep model behavior. The implementation is publicly available, supporting practical research in interpretable low-level vision.
📝 Abstract
Deep neural networks have significantly improved the performance of low-level vision tasks but have also made these models harder to interpret. A deep understanding of deep models is beneficial for both network design and practical reliability. To address this challenge, we introduce causality theory to interpret low-level vision models and propose a model-/task-agnostic method called Causal Effect Map (CEM). With CEM, we can visualize and quantify the input-output relationships in terms of either positive or negative effects. After analyzing various low-level vision tasks with CEM, we have reached several interesting insights, such as: (1) Using more information from input images (e.g., a larger receptive field) does NOT always yield positive outcomes. (2) Attempting to incorporate mechanisms with a global receptive field (e.g., channel attention) into image denoising may prove futile. (3) Integrating multiple tasks to train a general model could encourage the network to prioritize local information over global context. Based on causal effect theory, the proposed diagnostic tool can refresh our common knowledge and bring a deeper understanding of low-level vision models. Codes are available at https://github.com/J-FHu/CEM.
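The core idea of a counterfactual intervention, replacing a group of input pixels with a baseline and measuring the signed change each output pixel experiences, can be sketched as follows. This is a minimal illustration under assumed names (`causal_effect_map`, `box_blur`), not the paper's actual CEM implementation, which uses gradient-driven local effect decomposition on trained deep models:

```python
import numpy as np

def causal_effect_map(model, img, region, baseline=0.0):
    """Estimate the causal effect of the pixels in `region` on the output.
    Counterfactual intervention: replace the region with a baseline value,
    re-run the model, and return the signed per-pixel output change.
    Positive values = the region raised that output pixel; negative = lowered.
    (Illustrative sketch only; the paper's CEM differs in how effects
    are decomposed and attributed.)"""
    out = model(img)
    cf = img.copy()
    rs, re, cs, ce = region
    cf[rs:re, cs:ce] = baseline          # counterfactual input
    out_cf = model(cf)
    return out - out_cf                  # signed causal effect per output pixel

# Toy "model": a 3x3 box blur, so each output pixel causally depends
# only on its immediate neighborhood (receptive field of radius 1).
def box_blur(x):
    padded = np.pad(x, 1, mode="edge")
    h, w = x.shape
    return sum(padded[i:i + h, j:j + w]
               for i in range(3) for j in range(3)) / 9.0

img = np.zeros((8, 8))
img[3:5, 3:5] = 1.0
effect = causal_effect_map(box_blur, img, region=(3, 5, 3, 5))
# Pixels far outside the blur's receptive field of the region show zero effect.
```

Because the toy model's receptive field is only 3x3, the effect map is nonzero only within one pixel of the intervened region, which mirrors the paper's point that the *effective* causal footprint of a model can be much smaller than its nominal receptive field.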