🤖 AI Summary
Existing data-free knowledge distillation methods perform poorly on tabular data because they overlook feature interactions, which are central to the decision-making of tabular models. This work introduces the principle of "interaction diversity" and proposes a distillation framework explicitly designed to maximize coverage of feature interactions. The approach learns adaptive feature bins aligned with the teacher model's decision boundaries and generates synthetic query samples that maximize pairwise interaction coverage. Evaluated across four benchmark datasets and four teacher architectures, yielding 16 distinct configurations, the method achieves the highest teacher-student consistency in 14 settings, significantly outperforming five state-of-the-art baselines. These results demonstrate the effectiveness and generalizability of the proposed framework for knowledge distillation in tabular domains.
📝 Abstract
Data-free knowledge distillation enables model compression without access to the original training data, which is critical for privacy-sensitive tabular domains. However, existing methods do not perform well on tabular data because they do not explicitly address feature interactions, the fundamental way tabular models encode predictive knowledge. We identify interaction diversity, the systematic coverage of feature combinations, as an essential requirement for effective tabular distillation. To operationalize this insight, we propose TabKD, which learns adaptive feature bins aligned with teacher decision boundaries, then generates synthetic queries that maximize pairwise interaction coverage. Across 4 benchmark datasets and 4 teacher architectures, TabKD achieves the highest student-teacher agreement in 14 out of 16 configurations, outperforming 5 state-of-the-art baselines. We further show that interaction coverage strongly correlates with distillation quality, validating our core hypothesis. Our work establishes interaction-focused exploration as a principled framework for tabular model extraction.
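The abstract's central quantity, pairwise interaction coverage over discretized features, can be made concrete with a small sketch. The function below is an illustration of the general idea, not the paper's implementation: given synthetic samples and per-feature bin edges (in TabKD these would come from the learned adaptive binning), it measures the fraction of all (bin of feature i, bin of feature j) combinations hit by at least one sample.

```python
import itertools
import numpy as np

def interaction_coverage(samples, bin_edges):
    """Fraction of pairwise (bin_i, bin_j) combinations covered by `samples`.

    samples:   (n_samples, n_features) float array of synthetic queries.
    bin_edges: dict mapping feature index -> sorted 1-D array of bin edges.
    All names and signatures here are illustrative assumptions.
    """
    n_features = len(bin_edges)
    # Discretize each feature into bin indices (0 .. len(edges)).
    binned = np.column_stack(
        [np.digitize(samples[:, f], bin_edges[f]) for f in range(n_features)]
    )
    covered, total = 0, 0
    for i, j in itertools.combinations(range(n_features), 2):
        n_i = len(bin_edges[i]) + 1  # number of bins for feature i
        n_j = len(bin_edges[j]) + 1
        total += n_i * n_j
        # Distinct bin pairs actually hit for this feature pair.
        covered += len(set(zip(binned[:, i], binned[:, j])))
    return covered / total

# Toy example: 2 features, 2 bins each -> 4 possible bin pairs.
edges = {0: np.array([0.5]), 1: np.array([0.5])}
queries = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
print(interaction_coverage(queries, edges))        # all 4 pairs covered -> 1.0
print(interaction_coverage(queries[:1], edges))    # 1 of 4 pairs -> 0.25
```

A query generator in this spirit would greedily propose samples that add previously uncovered bin pairs, pushing this score toward 1.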