Plug-and-play Class-aware Knowledge Injection for Prompt Learning with Visual-Language Model

📅 2026-05-07
📈 Citations: 0 (influential: 0)

🤖 AI Summary
Existing prompt learning methods for vision-language models learn either class-shared or instance-specific prompts and often overlook class-specific knowledge, which limits zero-shot classification performance. To address this, the work proposes CAKI (Class-Aware Knowledge Injection), a plug-and-play framework that explicitly adds class-level knowledge to prompt learning. CAKI builds a class-level knowledge bank by learning class-specific prompts from few-shot samples of each class, then retrieves the relevant knowledge for each test instance through query-key matching and injects it to refine predictions. The framework plugs into existing methods, complementing their class-shared or instance-specific prompts with class-level information. Experiments show that CAKI improves the performance of existing methods on both base and novel classes. The code is publicly available.
📝 Abstract
Prompt learning has become an effective and widely used technique for enhancing vision-language models (VLMs) such as CLIP on various downstream tasks, particularly zero-shot classification within specific domains. Existing methods typically focus on either learning class-shared prompts for a given domain or generating instance-specific prompts through conditional prompt learning. While these methods have achieved promising performance, they often overlook class-specific knowledge in prompt design, leading to suboptimal outcomes. The underlying reasons are: 1) class-specific prompts offer more fine-grained supervision than coarse class-shared prompts, which helps prevent misclassification of data from different classes into a single class; 2) compared to class-specific prompts, instance-specific prompts neglect the richer class-level information across multiple instances, potentially causing data from the same class to be divided into multiple classes. To effectively inject class-specific knowledge into existing methods, we propose a plug-and-play Class-Aware Knowledge Injection (CAKI) framework. CAKI comprises two key components: class-specific prompt generation and query-key prompt matching. The former encodes class-specific knowledge into prompts from few-shot samples that belong to the same class and stores the learned prompts in a class-level knowledge bank. The latter provides a plug-and-play mechanism for each test instance to retrieve relevant class-level knowledge from the knowledge bank and inject that knowledge to refine model predictions. Extensive experiments demonstrate that CAKI effectively improves the performance of existing methods on base and novel classes. Code is publicly available at https://github.com/yjh576/CAKI.
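To make the retrieve-and-inject idea concrete, here is a minimal sketch of a class-level bank with query-key matching. It is not the authors' implementation: the abstract does not specify how keys are formed, how similarity is measured, or how retrieved knowledge is injected. The sketch assumes each class key is the mean of that class's few-shot feature vectors, that matching uses cosine similarity, and that injection is a simple additive bonus weighted by a hypothetical `alpha`. The function names `build_bank`, `retrieve`, and `refine` and the toy 2-D features are illustrative only; in CAKI the bank stores learned prompts rather than raw feature means.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_bank(few_shot: Dict[str, List[Vector]]) -> Dict[str, Vector]:
    """Assumption: one key per class, the mean of its few-shot features."""
    bank = {}
    for cls, feats in few_shot.items():
        n = len(feats)
        bank[cls] = [sum(col) / n for col in zip(*feats)]
    return bank


def retrieve(query: Vector, bank: Dict[str, Vector],
             top_k: int = 1) -> List[Tuple[str, float]]:
    """Return the top_k (class, similarity) pairs for a test-instance query."""
    scored = [(cls, cosine(query, key)) for cls, key in bank.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def refine(base_scores: Dict[str, float],
           retrieved: List[Tuple[str, float]],
           alpha: float = 0.5) -> Dict[str, float]:
    """Assumption: inject retrieved knowledge as an additive score bonus."""
    refined = dict(base_scores)
    for cls, sim in retrieved:
        refined[cls] = refined.get(cls, 0.0) + alpha * sim
    return refined


# Toy few-shot features for two classes (hypothetical values).
few_shot = {
    "cat": [[1.0, 0.1], [0.9, 0.2]],
    "dog": [[0.1, 1.0], [0.2, 0.8]],
}
bank = build_bank(few_shot)

query = [0.8, 0.3]                         # feature of one test instance
base_scores = {"cat": 0.45, "dog": 0.55}   # scores from an existing method
retrieved = retrieve(query, bank, top_k=1)
refined = refine(base_scores, retrieved)

print(retrieved[0][0], max(refined, key=refined.get))
```

In this toy case the base method slightly prefers "dog", but the query matches the "cat" key most closely, so the injected class-level score flips the prediction. The additive rule is only one of many possible injection schemes and was chosen here for brevity.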
Problem

Research questions and friction points this paper is trying to address.

prompt learning
class-specific knowledge
vision-language model
zero-shot classification
knowledge injection
Innovation

Methods, ideas, or system contributions that make the work stand out.

class-aware prompting
knowledge injection
prompt learning
vision-language models
few-shot classification