Knowledge Distillation Detection for Open-weights Models

📅 2025-10-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper introduces the knowledge distillation detection task: determining whether a student model was produced via knowledge distillation from a given teacher model, under realistic constraints in which only the student's weights and black-box access to the teacher's API are available (no training data, no teacher weights). For this open-weights setting, the authors propose the first model-agnostic, data-free detection framework, combining adversarial input synthesis with cross-architecture statistical feature analysis to support both classification and generative models uniformly. The method requires no fine-tuning or training, relying solely on black-box API queries and analysis of the student's internal weights. Evaluated on CIFAR-10 and ImageNet for image classification, and on text-to-image generation, it improves detection accuracy over existing baselines by 59.6%, 71.2%, and 20.0%, respectively. This work provides the first practical solution for model provenance verification and intellectual property protection in deep learning.

📝 Abstract
We propose the task of knowledge distillation detection, which aims to determine whether a student model has been distilled from a given teacher, under a practical setting where only the student's weights and the teacher's API are available. This problem is motivated by growing concerns about model provenance and unauthorized replication through distillation. To address this task, we introduce a model-agnostic framework that combines data-free input synthesis and statistical score computation for detecting distillation. Our approach is applicable to both classification and generative models. Experiments on diverse architectures for image classification and text-to-image generation show that our method improves detection accuracy over the strongest baselines by 59.6% on CIFAR-10, 71.2% on ImageNet, and 20.0% for text-to-image generation. The code is available at https://github.com/shqii1j/distillation_detection.
Problem

Research questions and friction points this paper is trying to address.

Detecting unauthorized knowledge distillation from teacher models
Verifying model provenance using student weights and teacher APIs
Identifying distillation in classification and generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Detects distillation using model weights and API
Combines data-free synthesis with statistical scoring
Works for both classification and generative models
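The pipeline summarized above can be illustrated with a toy sketch: synthesize data-free probe inputs, query the black-box teacher and the white-box student on them, and compute a statistical divergence score that separates distilled students from independently trained models. All names, the linear toy models, and the use of random probes and mean KL divergence are illustrative assumptions, not the paper's actual synthesis or scoring procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
D, C = 16, 10  # toy input dimension and number of classes

# Toy weights: the "distilled" student is a noisy copy of the teacher,
# while the independent model is trained (here: drawn) separately.
W_T = rng.normal(size=(D, C))
W_distilled = W_T + 0.05 * rng.normal(size=(D, C))
W_independent = rng.normal(size=(D, C))

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def teacher_api(x):
    # Stand-in for black-box API access: only output probabilities are visible.
    return softmax(x @ W_T)

def student_forward(x, W):
    # White-box student: we hold its weights directly.
    return softmax(x @ W)

def distillation_score(student_W, n_probes=256):
    # 1) Data-free probe synthesis (random inputs stand in for the
    #    paper's adversarial input synthesis).
    x = rng.normal(size=(n_probes, D))
    # 2) Query teacher (black-box) and student (via its weights).
    p_t = teacher_api(x)
    p_s = student_forward(x, student_W)
    # 3) Statistical score: mean KL(teacher || student); lower means the
    #    student's output distribution tracks the teacher more closely.
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=1)
    return kl.mean()

s_d = distillation_score(W_distilled)
s_i = distillation_score(W_independent)
# A distilled student should score much lower than an independent model,
# so thresholding the score yields a detection decision.
```

In this sketch the decision rule is a simple threshold on the score; the paper's contribution lies in making the probe synthesis and scoring model-agnostic and applicable to generative models as well.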