Measuring the Sensitivity of Classification Models with the Error Sensitivity Profile

📅 2026-04-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study proposes the Error Sensitivity Profile (ESP), a metric that quantifies how sensitive a classification model's performance is to errors in a single feature or in multiple features of the training data. In experiments on two widely used datasets with 14 classification models, run with a custom-built dirty-data toolkit, the authors show that performance degradation does not necessarily track simple associations between a feature and the target variable. ESP identifies the error types and features that most impair predictive accuracy, which helps prioritize data-cleaning effort.

📝 Abstract

The quality of training data is critical to the performance of machine learning models. In this paper, the Error Sensitivity Profile (ESP) is proposed. It quantifies the sensitivity of model performance to errors in a single feature or in multiple features. By leveraging ESP, data-cleaning efforts can be prioritized based on error types and features most likely to affect model performance. To support the computation of this metric, an integrated suite of tools, called \dirty, is created. We conduct an extensive experimental study on two widely used datasets using 14 classification models, revealing that performance degradation is not always predictable from simple correlations with the target variable.
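
To make the idea of a per-feature sensitivity measurement concrete, here is a minimal sketch. It is not the paper's ESP definition or its toolkit, neither of which is specified in this summary. It assumes a profile can be approximated as the drop in clean-test accuracy when one training feature is corrupted at increasing error rates. The synthetic data, the nearest-centroid classifier, the constant-overwrite error type, and every function name are hypothetical choices made for illustration.

```python
import random

def make_data(n, rng):
    # Synthetic binary task (an assumption for illustration):
    # feature 0 carries the class signal, feature 1 is pure noise.
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([4.0 * label + rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y

def fit_centroids(X, y):
    # Nearest-centroid classifier: store the mean feature vector of each class.
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}

def accuracy(centroids, X, y):
    def predict(row):
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def corrupt(X, feature, rate, rng, value=0.0):
    # Hypothetical error type: overwrite a fraction `rate` of one column
    # with a constant, returning a corrupted copy of X.
    return [
        [value if j == feature and rng.random() < rate else v
         for j, v in enumerate(row)]
        for row in X
    ]

def sensitivity_profile(X_train, y_train, X_test, y_test,
                        rates=(0.0, 0.5, 1.0), seed=0):
    # For each feature, record the accuracy drop on clean test data
    # after training on data corrupted at each error rate.
    baseline = accuracy(fit_centroids(X_train, y_train), X_test, y_test)
    profile = {}
    for feature in range(len(X_train[0])):
        rng = random.Random(seed)
        drops = []
        for rate in rates:
            dirty = corrupt(X_train, feature, rate, rng)
            model = fit_centroids(dirty, y_train)
            drops.append(baseline - accuracy(model, X_test, y_test))
        profile[feature] = drops
    return profile

rng = random.Random(42)
X_train, y_train = make_data(400, rng)
X_test, y_test = make_data(400, rng)
profile = sensitivity_profile(X_train, y_train, X_test, y_test)
for feature, drops in profile.items():
    print(feature, [round(d, 3) for d in drops])
```

In this toy setup, the feature with the larger drop is the natural one to clean first. The abstract's caution still applies: on real datasets and models, the ranking is not always predictable from simple correlations with the target, so it has to be measured rather than assumed.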

Problem

Research questions and friction points this paper is trying to address.

error sensitivity

classification models

data quality

feature errors

model performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Error Sensitivity Profile

data quality

model sensitivity