What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias

📅 2024-10-10
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work systematically investigates how knowledge distillation affects the fairness and class-wise bias of deep neural networks. Distillation is evaluated across diverse benchmarks—CIFAR-100, Tiny ImageNet, and ImageNet for class-wise accuracy shifts, and CelebA, Trifeature, and HateXplain for fairness assessment—using demographic parity difference, equalized odds difference, and an individual fairness metric. The analysis reveals that distillation induces significant, non-uniform accuracy shifts across classes, with up to 41% of classes exhibiting statistically significant deviations. Crucially, increasing the distillation temperature consistently improves both group-level and individual fairness of student models, sometimes surpassing the teacher's performance on fairness metrics. This is the first study to empirically characterize distillation's heterogeneous effects on class-wise accuracy distributions and to demonstrate that elevated temperature serves as a controllable lever for fairness. These findings introduce new evaluation dimensions and optimization strategies for deploying distilled models in fairness-sensitive applications.

📝 Abstract
Knowledge distillation is a commonly used deep neural network (DNN) compression method that often maintains overall generalization performance. However, we show that even for balanced image classification datasets, such as CIFAR-100, Tiny ImageNet, and ImageNet, as many as 41% of the classes are statistically significantly affected by distillation when comparing class-wise accuracy (i.e., class bias) between a teacher and its distilled student, or between a distilled and a non-distilled student model. Changes in class bias are not necessarily undesirable when considered outside the context of a model's usage. Using two common fairness metrics, Demographic Parity Difference (DPD) and Equalized Odds Difference (EOD), on models trained with the CelebA, Trifeature, and HateXplain datasets, our results suggest that increasing the distillation temperature improves the distilled student model's fairness, and at high temperatures the student's fairness can even surpass that of the teacher. Additionally, we examine individual fairness, i.e., ensuring that similar instances receive similar predictions; our results confirm that higher temperatures also improve the distilled student model's individual fairness. This study highlights the uneven effects of distillation on certain classes and its potentially significant role in fairness, emphasizing that caution is warranted when using distilled models in sensitive application domains.
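The "distillation temperature" central to the abstract is the softmax temperature from the standard Hinton-style formulation: higher temperatures flatten the teacher's output distribution, exposing more information about non-target classes. A minimal NumPy sketch of this mechanism (illustrative only; the function names are not from the paper's code):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T yields a flatter distribution."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher_T || student_T), scaled by T**2 as in Hinton et al. (2015)."""
    p = softmax(teacher_logits, T)    # softened teacher targets
    q = softmax(student_logits, T)    # softened student predictions
    return T**2 * float(np.sum(p * (np.log(p) - np.log(q))))

# At T=1 the teacher's distribution is sharp; at higher T it flattens,
# so the student trains against richer signal about non-target classes.
teacher, student = [8.0, 2.0, 1.0], [6.0, 3.0, 1.5]
loss_low_T = distillation_loss(student, teacher, T=1.0)
loss_high_T = distillation_loss(student, teacher, T=8.0)
```

In practice this term is combined with a cross-entropy loss on the hard labels; the paper's finding is that raising T in this objective also shifts the student's fairness metrics.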
Problem

Research questions and friction points this paper is trying to address.

Impact of distillation on class bias in image classification
Effect of distillation temperature on model fairness metrics
Influence of distillation on individual fairness in predictions
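The group-fairness metrics named above have standard definitions that can be sketched in a few lines of NumPy (a minimal sketch with illustrative function names, not the authors' implementation): DPD is the gap in positive-prediction rates across groups, and EOD is the largest gap in true-positive or false-positive rates.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Largest gap in P(y_hat = 1 | group) across groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equalized_odds_difference(y_true, y_pred, group):
    """Largest gap across groups in either TPR or FPR."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs, fprs = [], []
    for g in np.unique(group):
        m = group == g
        tprs.append(y_pred[m & (y_true == 1)].mean())  # true-positive rate
        fprs.append(y_pred[m & (y_true == 0)].mean())  # false-positive rate
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))
```

Both metrics are 0 for a perfectly group-fair classifier and grow toward 1 as disparity increases, which is why the paper reports *decreases* in DPD/EOD as fairness improvements.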
Innovation

Methods, ideas, or system contributions that make the work stand out.

Higher distillation temperature improves fairness
Distillation affects class bias significantly
Individual fairness enhanced by temperature increase