🤖 AI Summary
This work addresses the knowledge gaps that large language models (LLMs) exhibit in recommendation tasks, which arise from imbalanced pretraining data distributions and lead to an uneven understanding of different items. To mitigate this issue, the authors propose KnowSA_CKP, a knowledge-aware selective augmentation mechanism. Using Comparative Knowledge Probing, the method evaluates the model's intrinsic comprehension of item co-occurrence relationships and dynamically decides whether to inject external knowledge, avoiding redundant supplementation for items the model already understands well. Evaluated on four real-world datasets without any fine-tuning, KnowSA_CKP significantly improves both recommendation accuracy and the efficiency of context budget utilization.
📝 Abstract
Large language models (LLMs) have recently emerged as powerful training-free recommenders. However, their knowledge of individual items is inevitably uneven due to imbalanced information exposure during pretraining, a phenomenon we refer to as the knowledge gap problem. To address this, most prior methods employ naive uniform augmentation, appending external information for every item in the input prompt. This approach not only wastes the limited context budget on redundant augmentation for well-known items but can also hinder the model's reasoning. To this end, we propose KnowSA_CKP (Knowledge-aware Selective Augmentation with Comparative Knowledge Probing) to mitigate the knowledge gap problem. KnowSA_CKP estimates the LLM's internal knowledge by evaluating its ability to capture collaborative relationships between items, and selectively injects additional information only where it is most needed. By avoiding unnecessary augmentation for well-known items, KnowSA_CKP focuses on the items that benefit most from knowledge supplementation, making more effective use of the context budget. KnowSA_CKP requires no fine-tuning and consistently improves both recommendation accuracy and context efficiency across four real-world datasets.
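The selective-augmentation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`probe_knowledge`, `build_prompt`), the comparative probe format (does the model rank a true co-occurring item above a distractor?), and the accuracy threshold are all illustrative assumptions; `llm_score` stands in for any scoring call to an actual LLM.

```python
# Hypothetical sketch of knowledge-aware selective augmentation,
# assuming a comparative probe over item co-occurrence pairs.

def probe_knowledge(llm_score, item, co_item, distractor):
    """One comparative probe: does the model score the true
    co-occurring item above a distractor item?"""
    return llm_score(item, co_item) > llm_score(item, distractor)

def build_prompt(history, metadata, llm_score, probes, threshold=0.5):
    """Append external metadata only for items whose probe accuracy
    falls below the threshold (i.e., items the model 'knows' poorly)."""
    lines = []
    for item in history:
        trials = probes.get(item, [])
        if trials:
            hits = sum(probe_knowledge(llm_score, item, pos, neg)
                       for pos, neg in trials)
            acc = hits / len(trials)
        else:
            acc = 0.0  # no probes available: treat as unknown
        if acc < threshold and item in metadata:
            lines.append(f"{item} ({metadata[item]})")  # augment
        else:
            lines.append(item)  # well-known: save context budget
    return "User history: " + ", ".join(lines)
```

As a toy usage, a popular item whose probes the model answers correctly is left bare, while an obscure item that fails its probes gets its metadata appended, so the context budget is spent only where the knowledge gap is.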