🤖 AI Summary
This work investigates whether knowledge distillation (KD) is necessary and effective for LLM-based bundle generation, with the goal of sustaining high performance at low computational cost. To this end, it systematically disentangles the impact of KD along three dimensions for the first time: knowledge format, knowledge quantity, and utilization strategy. The authors propose a hierarchical knowledge-extraction framework that captures pattern-level, rule-level, and deep-reasoning knowledge, coupled with a multi-paradigm adaptation mechanism supporting in-context learning (ICL), supervised fine-tuning (SFT), and hybrid fine-tuning. Through progressive knowledge extraction and multi-granularity knowledge fusion, the approach markedly improves the student models' efficiency–performance trade-off, retaining ≥92% of teacher-model performance across multiple benchmarks while reducing inference cost by 67%. The core contribution lies in establishing the critical role of structured KD in bundle generation and delivering a reusable, lightweight deployment paradigm.
📝 Abstract
LLMs are increasingly explored for bundle generation, thanks to their reasoning capabilities and extensive knowledge. However, deploying large-scale LLMs introduces significant efficiency challenges, primarily high computational costs during fine-tuning and inference due to their massive parameterization. Knowledge distillation (KD) offers a promising solution, transferring expertise from large teacher models to compact student models. This study systematically investigates knowledge distillation approaches for bundle generation, aiming to minimize computational demands while preserving performance. We explore three critical research questions: (1) How does the format of KD impact bundle generation performance? (2) To what extent does the quantity of distilled knowledge influence performance? (3) How do different ways of utilizing the distilled knowledge affect performance? We propose a comprehensive KD framework that (i) progressively extracts knowledge (patterns, rules, deep thoughts); (ii) captures varying quantities of distilled knowledge through different strategies; and (iii) exploits complementary LLM adaptation techniques (in-context learning, supervised fine-tuning, and their combination) to leverage distilled knowledge in small student models for domain-specific adaptation and enhanced efficiency. Extensive experiments provide valuable insights into how knowledge format, quantity, and utilization methodologies collectively shape LLM-based bundle generation performance, demonstrating KD's significant potential for more efficient yet effective LLM-based bundle generation.
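To make the pipeline concrete, the sketch below illustrates the three knowledge formats (patterns, rules, deep thoughts) and two of the utilization paradigms (ICL and SFT) described in the abstract. All names, prompt templates, and data structures here are illustrative assumptions, not the paper's actual implementation; the `teacher` callable stands in for a real teacher-LLM API.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical container for the three knowledge formats named in the
# abstract; field names are our own, not the paper's.
@dataclass
class DistilledKnowledge:
    patterns: list[str] = field(default_factory=list)  # surface-level item patterns
    rules: list[str] = field(default_factory=list)     # bundling rules induced by the teacher
    thoughts: list[str] = field(default_factory=list)  # step-by-step rationales ("deep thoughts")

def extract_progressively(teacher: Callable[[str], str],
                          sessions: list[str]) -> DistilledKnowledge:
    """Query the teacher at increasing depth for each user session.

    The prompts are placeholders; a real system would use task-specific
    templates and the teacher LLM's actual API."""
    kd = DistilledKnowledge()
    for s in sessions:
        kd.patterns.append(teacher(f"List item patterns in: {s}"))
        kd.rules.append(teacher(f"State a bundling rule for: {s}"))
        kd.thoughts.append(teacher(f"Explain step by step how to bundle: {s}"))
    return kd

def build_icl_prompt(kd: DistilledKnowledge, query: str, k: int = 2) -> str:
    """Utilization via in-context learning: prepend the first k rules and
    rationales as demonstrations for the small student model."""
    demos = "\n".join(kd.rules[:k] + kd.thoughts[:k])
    return f"{demos}\nNow generate a bundle for: {query}"

def build_sft_records(kd: DistilledKnowledge,
                      sessions: list[str]) -> list[dict]:
    """Utilization via supervised fine-tuning: pair each session with the
    teacher's rationale as the student's training target."""
    return [{"input": s, "target": t} for s, t in zip(sessions, kd.thoughts)]
```

Varying the number of sessions passed to `extract_progressively` (and `k` in `build_icl_prompt`) corresponds to the paper's second question about how the quantity of distilled knowledge influences performance.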