Black-Box On-Policy Distillation of Large Language Models

📅 2025-11-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Knowledge distillation from large language models (LLMs) in black-box settings—where teacher model parameters and internal logits are inaccessible—remains challenging because training must rely on limited textual outputs alone. Method: We propose Generative Adversarial Distillation (GAD), an end-to-end adversarial framework comprising a generator (student model) and a dynamic reward model (discriminator), trained solely on teacher-generated text. GAD unifies policy-gradient optimization, sequence-level knowledge distillation, and reward modeling to enable adversarial, policy-level distillation. Contribution/Results: GAD significantly improves training stability and generalization. Empirical evaluation shows that the distilled Qwen2.5-14B-Instruct achieves performance on LMSYS-Chat comparable to its teacher, GPT-5-Chat, substantially outperforming conventional white-box and black-box distillation baselines. This establishes a novel paradigm for efficient model compression and transfer under resource-constrained conditions.

📝 Abstract
Black-box distillation creates student large language models (LLMs) by learning from a proprietary teacher model's text outputs alone, without access to its internal logits or parameters. In this work, we introduce Generative Adversarial Distillation (GAD), which enables on-policy and black-box distillation. GAD frames the student LLM as a generator and trains a discriminator to distinguish its responses from the teacher LLM's, creating a minimax game. The discriminator acts as an on-policy reward model that co-evolves with the student, providing stable, adaptive feedback. Experimental results show that GAD consistently surpasses the commonly used sequence-level knowledge distillation. In particular, Qwen2.5-14B-Instruct (student) trained with GAD becomes comparable to its teacher, GPT-5-Chat, on the LMSYS-Chat automatic evaluation. The results establish GAD as a promising and effective paradigm for black-box LLM distillation.
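The minimax setup in the abstract can be illustrated with a deliberately tiny sketch: the student is a softmax policy over a toy vocabulary, the discriminator is a per-token logistic score trained to tell teacher samples from student samples, and the student is updated with REINFORCE using the discriminator's score as its on-policy reward. Everything here (single-token "responses", the vocabulary, learning rates, update rules) is an illustrative assumption, not the paper's implementation, which operates on full LLM response sequences.

```python
# Toy sketch of the GAD minimax loop (illustrative assumptions throughout).
import math
import random

random.seed(0)
VOCAB = 4
teacher_probs = [0.7, 0.2, 0.05, 0.05]  # stand-in for black-box teacher text outputs

student_logits = [0.0] * VOCAB          # generator (student policy) parameters
disc_logits = [0.0] * VOCAB             # discriminator: one score per token

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

LR_G, LR_D = 0.5, 0.5
for step in range(5000):
    probs = softmax(student_logits)
    s_tok = sample(probs)               # on-policy student sample
    t_tok = sample(teacher_probs)       # teacher sample (text output only)

    # Discriminator step: raise the score on the teacher's token,
    # lower it on the student's (binary logistic updates).
    disc_logits[t_tok] += LR_D * (1.0 - sigmoid(disc_logits[t_tok]))
    disc_logits[s_tok] -= LR_D * sigmoid(disc_logits[s_tok])

    # Generator step: REINFORCE with the discriminator's score as reward;
    # the baseline (expected reward under the current policy) reduces variance.
    rewards = [sigmoid(x) for x in disc_logits]
    baseline = sum(p * r for p, r in zip(probs, rewards))
    advantage = rewards[s_tok] - baseline
    for i in range(VOCAB):
        grad_log_pi = (1.0 if i == s_tok else 0.0) - probs[i]
        student_logits[i] += LR_G * advantage * grad_log_pi

final = softmax(student_logits)
print([round(p, 2) for p in final])
```

The discriminator's fixed point for each token is the classic GAN optimum, sigma(d_i) = p_teacher(i) / (p_teacher(i) + p_student(i)), so the reward exceeds the baseline exactly where the student under-produces relative to the teacher; alternating the two updates is what lets the reward model co-evolve with the student as the abstract describes.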
Problem

Research questions and friction points this paper is trying to address.

Developing black-box distillation using only teacher text outputs
Creating on-policy training without internal model access
Improving student LLM performance to match proprietary teachers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Black-box distillation using generative adversarial training
Discriminator co-evolves with student as reward model
On-policy learning without teacher model internals