Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability

📅 2025-04-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) are hard to deploy in resource-constrained environments because of their prohibitive computational and memory overhead. To address this, the authors systematically compare diverse knowledge distillation (KD) approaches and propose, for the first time, a *critique-revision prompting* framework for generating high-quality, explanation-rich distilled data. The method combines several existing training strategies and evaluates KD's effect on both student model accuracy and interpretability. Empirical evaluation on CommonsenseQA demonstrates that the approach preserves student model accuracy while significantly improving explanation quality, logical consistency, and human comprehensibility. Crucially, this work is the first to empirically characterize the inherent trade-off between accuracy and interpretability in LLM distillation. It establishes a reproducible, interpretable pathway for deploying compact, edge-ready LLMs in real-world applications.

📝 Abstract

Artificial Intelligence (AI) has increasingly influenced modern society, most recently through significant advances in Large Language Models (LLMs). However, the high computational and storage demands of LLMs still limit their deployment in resource-constrained environments. Knowledge distillation addresses this challenge by training a small student model from a larger teacher model. Previous research has introduced several distillation methods, both for generating training data and for training the student model. Despite their relevance, the effects of state-of-the-art distillation methods on model performance and explainability have not been thoroughly investigated and compared. In this work, we enlarge the set of available methods by applying critique-revision prompting to distillation for data generation and by synthesizing existing methods for training. For these methods, we provide a systematic comparison based on the widely used Commonsense Question-Answering (CQA) dataset. We measure performance via student model accuracy and evaluate explainability with a human-grounded study. We contribute new distillation methods and their comparison in terms of both performance and explainability. This should further advance the distillation of small language models and thus contribute to broader applicability and faster diffusion of LLM technology.

Problem

Research questions and friction points this paper is trying to address.

Investigates impact of knowledge distillation on model performance and explainability

Compares state-of-the-art distillation methods for training small student models

Evaluates new distillation techniques using accuracy and human-grounded explainability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Critique-revision prompting for data generation

Synthesizing existing methods for training

Systematic comparison on Commonsense Question-Answering dataset
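
As a rough illustration of how critique-revision prompting could produce explanation-rich training data, the sketch below asks a teacher model for a draft answer, asks it to critique that draft, and then asks for a revision. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the prompt wording, the `rounds` parameter, and the `stub_teacher` stand-in are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical interface: a teacher is any function mapping a prompt to text.
Teacher = Callable[[str], str]


def critique_revise(question: str, teacher: Teacher, rounds: int = 1) -> Dict[str, str]:
    """Build one distillation example via draft -> critique -> revision.

    The prompt templates are illustrative assumptions, not the paper's prompts.
    """
    draft = teacher(f"Question: {question}\nAnswer with a short explanation.")
    for _ in range(rounds):
        critique = teacher(
            f"Question: {question}\nDraft: {draft}\nCritique the draft's reasoning."
        )
        draft = teacher(
            f"Question: {question}\nDraft: {draft}\nCritique: {critique}\n"
            "Revise the draft to address the critique."
        )
    # The final draft becomes the explanation the student is trained on.
    return {"question": question, "explanation": draft}


def stub_teacher(prompt: str) -> str:
    """Deterministic stand-in for a real teacher LLM (assumption, for demo only)."""
    if "Revise the draft" in prompt:
        return "revised answer"
    if "Critique the draft" in prompt:
        return "critique"
    return "draft answer"


example = critique_revise("Where would you find a fish?", stub_teacher, rounds=1)
```

In practice, `stub_teacher` would be replaced by a call to the actual teacher model, and the returned records would form the training set for the student.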
