Weighted Multi-Prompt Learning with Description-free Large Language Model Distillation

📅 2025-07-09

🤖 AI Summary

Existing vision-language prompt learning methods rely on text descriptions generated by large language models (LLMs), which vary widely between queries and are not always reliable. To address this, we propose a description-free multi-prompt learning framework. It discards discrete textual templates and instead distills LLM knowledge directly into continuous, differentiable prompt vectors. It also introduces a prompt weighting mechanism that models the importance of each prompt during training. As a result, the method depends on neither handcrafted templates nor LLM-generated textual descriptions. Evaluated on 11 recognition datasets, it outperforms existing prompt learning methods.

📝 Abstract

Recent advances in pre-trained Vision Language Models (VLMs) have shown promising potential for adapting effectively to downstream tasks through prompt learning, without the need for additional annotated paired datasets. To supplement the textual information in VLMs, which are trained on correlations with visual data, new approaches that leverage Large Language Models (LLMs) in prompts have been proposed, enhancing robustness to unseen and diverse data. Existing methods typically extract text-based responses (i.e., descriptions) from an LLM and incorporate them into prompts; however, this approach suffers from high variability and low reliability. In this work, we propose Description-free Multi-prompt Learning (DeMul), a novel method that eliminates the process of extracting descriptions and instead directly distills knowledge from the LLM into prompts. By adopting a description-free approach, prompts can encapsulate richer semantics while still being represented as continuous vectors for optimization, thereby eliminating the need for discrete pre-defined templates. Additionally, in a multi-prompt setting, we empirically demonstrate the potential of prompt weighting to reflect the importance of different prompts during training. Experimental results show that our approach achieves superior performance across 11 recognition datasets.
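
To make the multi-prompt weighting idea in the abstract concrete, here is a minimal NumPy sketch of how several prompts per class might be combined into one class score. It is an illustration under stated assumptions, not the paper's implementation: the function names, the tensor shapes, and the choice of a softmax to normalize the weights are all hypothetical, and the abstract does not specify how the weights are parameterized.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def weighted_multi_prompt_logits(image_feats, prompt_feats, prompt_weights):
    """Combine M prompts per class into a single score per class.

    image_feats:    (B, D) image embeddings from a vision encoder
    prompt_feats:   (C, M, D) text embeddings of M learned prompts for C classes
    prompt_weights: (C, M) raw (unnormalized) importance of each prompt

    Assumption: weights are normalized per class with a softmax. This is
    an illustrative choice, not something the abstract states.
    """
    img = l2_normalize(image_feats)
    txt = l2_normalize(prompt_feats)

    # Cosine similarity between every image and every prompt: (B, C, M)
    sims = np.einsum("bd,cmd->bcm", img, txt)

    # Numerically stable softmax over the M prompts of each class: (C, M)
    shifted = prompt_weights - prompt_weights.max(axis=-1, keepdims=True)
    w = np.exp(shifted)
    w = w / w.sum(axis=-1, keepdims=True)

    # Weighted average of per-prompt similarities: (B, C)
    return (sims * w[None, :, :]).sum(axis=-1)
```

With all raw weights equal, this reduces to a plain average over prompts; letting the weights be learned is what allows some prompts to count more than others during training.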

Problem

Research questions and friction points this paper is trying to address.

Eliminates unreliable LLM description extraction for prompts

Directly distills LLM knowledge into continuous prompt vectors

Introduces weighted multi-prompt learning for improved recognition

Innovation

Methods, ideas, or system contributions that make the work stand out.

Directly distills LLM knowledge into prompts

Eliminates description extraction for richer semantics

Uses prompt weighting in multi-prompt training
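
The first two points above can be pictured as a loss that pulls learned prompt embeddings toward an LLM's embedding of each class, with no generated description text in between. The sketch below is one plausible reading only: the linear projection `proj`, the use of cosine distance, and all names and shapes are assumptions made for illustration, since this summary does not give the paper's actual objective.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def distillation_loss(prompt_feats, llm_class_embeds, proj):
    """Mean cosine distance between projected prompts and LLM class embeddings.

    prompt_feats:     (C, M, D) embeddings of M continuous prompts for C classes
    llm_class_embeds: (C, E) one LLM embedding per class (hypothetical input)
    proj:             (D, E) linear map from prompt space to LLM space
                      (hypothetical; the real mapping may differ)
    """
    # Map prompt embeddings into the LLM embedding space: (C, M, E)
    mapped = l2_normalize(prompt_feats @ proj)
    target = l2_normalize(llm_class_embeds)

    # Cosine similarity of each prompt with its own class target: (C, M)
    cos = np.einsum("cme,ce->cm", mapped, target)

    # 0 when every prompt is aligned with its target, 2 when opposite
    return float(np.mean(1.0 - cos))
```

Because the prompts stay continuous vectors throughout, a loss of this kind is differentiable with respect to them, which is what makes end-to-end optimization possible without discrete templates.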
