Missingness Bias Calibration in Feature Attribution Explanations

📅 2026-03-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses missingness bias, a systematic distortion of the importance scores produced by feature attribution methods when a model is probed with ablated, out-of-distribution inputs. The authors propose MCal, a post-hoc correction that treats missingness bias as a superficial artifact of the output space of a frozen base model. MCal fine-tunes a simple linear head on the frozen model's outputs, without retraining the model or altering its architecture. Across diverse medical benchmarks spanning vision, language, and tabular data, this output-space calibration consistently reduces missingness bias and is competitive with, or even outperforms, prior heavyweight approaches. MCal is therefore a lightweight alternative that applies across all three domains.

📝 Abstract

Popular explanation methods often produce unreliable feature importance scores due to missingness bias, a systematic distortion that arises when models are probed with ablated, out-of-distribution inputs. Existing solutions treat this as a deep representational flaw that requires expensive retraining or architectural modifications. In this work, we challenge this assumption and show that missingness bias can be effectively treated as a superficial artifact of the model's output space. We introduce MCal, a lightweight post-hoc method that corrects this bias by fine-tuning a simple linear head on the outputs of a frozen base model. Surprisingly, we find this simple correction consistently reduces missingness bias and is competitive with, or even outperforms, prior heavyweight approaches across diverse medical benchmarks spanning vision, language, and tabular domains.
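To make the idea of correcting the bias in output space concrete, here is a minimal sketch rather than the authors' implementation. Everything in it is an assumption made for illustration: the toy `frozen_logits` scorer, the zero-value ablation in `ablate`, the choice of the frozen model's own clean predictions as training targets, and the gap in predicted class rates used as a stand-in bias measure. Only the overall shape, a linear head fitted on the logits of a frozen model, comes from the abstract.

```python
import math
import random


def frozen_logits(x):
    """Hypothetical frozen base model: a fixed 2-class scorer, never updated."""
    s = 4.0 * (sum(x) - 1.5)
    return [-s, s]


def ablate(x, rng):
    """Assumed ablation scheme: replace one randomly chosen feature with 0."""
    masked = list(x)
    masked[rng.randrange(len(x))] = 0.0
    return masked


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def apply_head(W, b, z):
    """Linear head on top of the frozen logits: W z + b."""
    return [sum(W[i][j] * z[j] for j in range(len(z))) + b[i]
            for i in range(len(b))]


def fit_head(logits, targets, lr=0.1, epochs=300):
    """Fit W, b by full-batch gradient descent on softmax cross-entropy."""
    k, n = len(logits[0]), len(logits)
    # Start from the identity map so the head initially changes nothing.
    W = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for _ in range(epochs):
        gW = [[0.0] * k for _ in range(k)]
        gb = [0.0] * k
        for z, y in zip(logits, targets):
            p = softmax(apply_head(W, b, z))
            for i in range(k):
                d = (p[i] - (1.0 if i == y else 0.0)) / n
                gb[i] += d
                for j in range(k):
                    gW[i][j] += d * z[j]
        for i in range(k):
            b[i] -= lr * gb[i]
            for j in range(k):
                W[i][j] -= lr * gW[i][j]
    return W, b


def argmax(z):
    return max(range(len(z)), key=lambda i: z[i])


def class1_rate(preds):
    return sum(preds) / len(preds)


rng = random.Random(0)
clean = [[rng.random() for _ in range(3)] for _ in range(400)]
masked = [ablate(x, rng) for x in clean]

clean_preds = [argmax(frozen_logits(x)) for x in clean]
masked_logits = [frozen_logits(x) for x in masked]  # base model stays frozen

# Assumption: train the head so ablated-input outputs match clean predictions.
W, b = fit_head(masked_logits, clean_preds)

raw_preds = [argmax(z) for z in masked_logits]
cal_preds = [argmax(apply_head(W, b, z)) for z in masked_logits]

# Stand-in bias measure: shift in predicted class-1 rate caused by ablation.
bias_before = abs(class1_rate(raw_preds) - class1_rate(clean_preds))
bias_after = abs(class1_rate(cal_preds) - class1_rate(clean_preds))
print(f"class-rate gap before: {bias_before:.3f}, after: {bias_after:.3f}")
```

In this toy setup, zeroing a feature pushes the frozen scorer toward class 0, so the raw predictions on ablated inputs are skewed. The head can only rescale and shift the logits, which is the sense in which the correction stays in output space. An attribution method would then query the frozen model plus head instead of the raw model.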

Problem

Research questions and friction points this paper is trying to address.

missingness bias

feature attribution

explanation reliability

out-of-distribution inputs

model interpretability

Innovation

Methods, ideas, or system contributions that make the work stand out.

missingness bias

feature attribution

post-hoc calibration