Using Advanced LLMs to Enhance Smaller LLMs: An Interpretable Knowledge Distillation Approach

📅 2024-08-13
🏛️ arXiv.org
📈 Citations: 4
Influential: 0

🤖 AI Summary
Problem: In goal-oriented customer service dialogue deployed at the edge or on premises, existing large language model (LLM) solutions struggle to balance performance, controllability, and cost. Proprietary models (e.g., GPT-4) are costly and hard to self-host, while lightweight open-source models fall short in capability. Method: We propose "policy distillation," a black-box, interpretable knowledge transfer paradigm that alternates between two stages: scenario generation and policy optimization. It builds an auditable, transferable library of prompt policies through black-box API calls, scenario-driven policy induction, and automated prompt engineering, with no parameter fine-tuning or response imitation. Contribution/Results: Experiments show that the method improves customer satisfaction for lightweight LLMs on customer service tasks, that the distilled policies transfer to other LLMs and to scenarios beyond the training set, and that the library can be audited by humans, which supports safety and operational control.

📝 Abstract
Advanced large language models (LLMs) like GPT-4 or Llama 3 provide superior performance in complex human-like interactions. But they are costly, or too large for edge devices such as smartphones and harder to self-host, leading to security and privacy concerns. This paper introduces a novel interpretable knowledge distillation approach to enhance the performance of smaller, more economical LLMs that firms can self-host. We study this problem in the context of building a customer service agent aimed at achieving high customer satisfaction through goal-oriented dialogues. Unlike traditional knowledge distillation, where the "student" model learns directly from the "teacher" model's responses via fine-tuning, our interpretable "strategy" teaching approach involves the teacher providing strategies to improve the student's performance in various scenarios. This method alternates between a "scenario generation" step and a "strategies for improvement" step, creating a customized library of scenarios and optimized strategies for automated prompting. The method requires only black-box access to both student and teacher models; hence it can be used without manipulating model parameters. In our customer service application, the method improves performance, and the learned strategies are transferable to other LLMs and scenarios beyond the training set. The method's interpretability helps safeguard against potential harms through human audit.
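The alternation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `build_strategy_library`, the prompt wording, the `score` callable, and the rule of keeping a strategy only when it raises the student's score are all assumptions made for illustration. The teacher and student are plain callables from prompt to text, which mirrors the black-box access the abstract describes.

```python
from typing import Callable, Dict

# Hypothetical black-box interface: prompt text in, response text out.
LLM = Callable[[str], str]


def build_strategy_library(
    teacher: LLM,
    student: LLM,
    score: Callable[[str, str], float],
    rounds: int = 3,
) -> Dict[str, str]:
    """Alternate scenario generation and strategy improvement (illustrative only)."""
    library: Dict[str, str] = {}
    for _ in range(rounds):
        # Scenario generation step: the teacher proposes a scenario
        # that is not yet covered by the library.
        known = "; ".join(library) or "none"
        scenario = teacher(f"Propose a customer service scenario not in: {known}")

        # Strategies-for-improvement step: the teacher suggests a strategy,
        # and it is kept only if the student scores higher when prompted with it
        # (an assumed acceptance rule, not taken from the paper).
        baseline = score(scenario, student(scenario))
        strategy = teacher(f"Give a strategy for the scenario: {scenario}")
        guided = score(scenario, student(f"Strategy: {strategy}\n{scenario}"))
        if guided > baseline:
            library[scenario] = strategy
    return library
```

Because the library is an ordinary mapping from scenario text to strategy text, a human reviewer can read and edit every entry before it is used for prompting.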
Problem

Research questions and friction points this paper is trying to address.

Improving goal-oriented dialogue performance for cost-effective LLMs
Addressing trade-offs between performance, control, and deployment costs
Enabling knowledge transfer from advanced to smaller language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompt-based knowledge distillation framework
Extracts tactical guidance from teacher LLM
Retrieves guidance from structured library during inference
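
As a rough sketch of the last point, the snippet below looks up guidance in a scenario-to-strategy library at inference time and prepends it to the student's prompt. The source does not say how retrieval is done, so the word-overlap (Jaccard) matching used here is an assumption, as are the names `retrieve_strategy` and `build_prompt` and the prompt layout.

```python
import re
from typing import Dict, Set


def _tokens(text: str) -> Set[str]:
    # Lowercased word set; punctuation is ignored.
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def retrieve_strategy(query: str, library: Dict[str, str]) -> str:
    """Return the strategy of the scenario with the highest word overlap."""
    if not library:
        return ""
    q = _tokens(query)

    def jaccard(scenario: str) -> float:
        s = _tokens(scenario)
        union = q | s
        return len(q & s) / len(union) if union else 0.0

    best_scenario = max(library, key=jaccard)
    return library[best_scenario]


def build_prompt(query: str, library: Dict[str, str]) -> str:
    """Prepend the retrieved strategy, if any, to the customer message."""
    strategy = retrieve_strategy(query, library)
    if not strategy:
        return f"Customer: {query}"
    return f"Strategy: {strategy}\nCustomer: {query}"
```

A production system would more likely match scenarios with embeddings or a classifier; word overlap is used here only to keep the example dependency-free.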
🔎 Similar Papers