🤖 AI Summary
Conventional model analysis focuses predominantly on erroneous predictions and overlooks variation in inherent instance difficulty. Method: We propose a difficulty-aware model analysis framework that, for the first time, establishes a multidimensional instance difficulty metric grounded in data-, model-, and human-centric perspectives, and introduces a mechanism for evaluating the consistency between model confidence and human-perceived difficulty. The approach integrates difficulty scoring, confidence analysis, annotation consistency checking, and interactive visualization, and is implemented in the DifficultyEyes tool for pattern-based diagnostic analysis. Contribution/Results: Experiments demonstrate that the framework identifies canonical defects, including data noise, class imbalance, and model fragility, and improves difficult-instance detection and root-cause attribution. Empirical validation on benchmarks such as ImageNet confirms its practical utility for robustness assessment and data quality diagnosis.
📝 Abstract
Traditional instance-based model analysis focuses mainly on misclassified instances. However, this approach overlooks the fact that instances vary in difficulty. Ideally, a robust model should recognize and reflect the challenges presented by intrinsically difficult instances. It is also valuable to investigate whether the difficulty perceived by the model aligns with that perceived by humans. To address this, we propose incorporating instance difficulty into the evaluation of deep neural networks for supervised classification tasks on image data. Specifically, we consider difficulty measures from three perspectives (data, model, and human) to facilitate comprehensive evaluation and comparison. Additionally, we develop an interactive visual tool, DifficultyEyes, that supports the identification of instances of interest based on various difficulty patterns and aids in analyzing potential data or model issues. Case studies demonstrate the effectiveness of our approach.
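
The abstract does not state how the individual difficulty measures are computed, so the following is only a minimal sketch of the general idea of comparing model-perceived and human-perceived difficulty. Everything in it is an assumption made for illustration: the model-side score (one minus the predicted probability of the true class), the human-side score (normalized entropy of annotator votes), the 0.5 threshold, and the names `model_difficulty`, `human_difficulty`, and `difficulty_patterns` are hypothetical and are not taken from the paper or from DifficultyEyes.

```python
import numpy as np


def model_difficulty(probs, labels):
    """Assumed model-side score: 1 - predicted probability of the true class."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    return 1.0 - probs[np.arange(len(labels)), labels]


def human_difficulty(annotations, num_classes):
    """Assumed human-side score: entropy of annotator votes, scaled to [0, 1].

    annotations has shape (n_instances, n_annotators); num_classes must be >= 2.
    """
    annotations = np.asarray(annotations, dtype=int)
    scores = np.zeros(len(annotations))
    for i, votes in enumerate(annotations):
        counts = np.bincount(votes, minlength=num_classes)
        p = counts[counts > 0] / counts.sum()
        scores[i] = -(p * np.log(p)).sum() / np.log(num_classes)
    return scores


def difficulty_patterns(model_d, human_d, threshold=0.5):
    """Group instance indices by whether model and humans find them hard.

    The threshold is an arbitrary illustrative choice.
    """
    model_hard = np.asarray(model_d) >= threshold
    human_hard = np.asarray(human_d) >= threshold
    return {
        "both_easy": np.flatnonzero(~model_hard & ~human_hard).tolist(),
        "both_hard": np.flatnonzero(model_hard & human_hard).tolist(),
        "model_only_hard": np.flatnonzero(model_hard & ~human_hard).tolist(),
        "human_only_hard": np.flatnonzero(~model_hard & human_hard).tolist(),
    }


# Toy data: 4 instances, 3 classes, 3 annotators per instance.
probs = [[0.90, 0.05, 0.05],
         [0.20, 0.70, 0.10],
         [0.10, 0.20, 0.70],
         [0.34, 0.33, 0.33]]
labels = [0, 0, 2, 1]
annotations = [[0, 0, 0],
               [0, 0, 0],
               [2, 1, 0],
               [1, 1, 2]]

model_d = model_difficulty(probs, labels)
human_d = human_difficulty(annotations, num_classes=3)
patterns = difficulty_patterns(model_d, human_d)
print(patterns)
```

Under these assumptions, the disagreement groups are the ones most likely to merit inspection. An instance that only the model finds hard may point to a model weakness, while one that only the annotators find hard may point to an ambiguous image or a noisy label.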