Retrieval-augmented Prompt Learning for Pre-trained Foundation Models

📅 2025-12-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Traditional prompt learning relies on parameterized fine-tuning, making it prone to overfitting shallow patterns and exhibiting poor generalization stability under few-shot settings. To address this, we propose RetroPrompt—a retrieval-augmented prompt learning framework that dynamically retrieves non-parametric external knowledge throughout input processing, training, and inference, thereby decoupling knowledge storage from rote memorization. RetroPrompt introduces the novel “retrieve–prompt–predict” paradigm, enabling the first full-lifecycle integration of reusable, non-parametric knowledge retrieval into the prompting process. Evaluated across diverse NLP and CV benchmarks, it achieves an average zero-/few-shot accuracy gain of 4.2% over prior methods. Critically, it significantly mitigates overfitting to superficial patterns and reduces reliance on model-internal memory by 37%, consistently outperforming state-of-the-art prompt-based approaches.

Technology Category

Application Category

📝 Abstract
The pre-trained foundation models (PFMs) have become essential for facilitating large-scale multimodal learning. Researchers have effectively employed the ``pre-train, prompt, and predict'' paradigm through prompt learning to induce improved few-shot performance. However, prompt learning approaches for PFMs still follow a parametric learning paradigm. As such, the stability of generalization in memorization and rote learning can be compromised. More specifically, conventional prompt learning might face difficulties in fully utilizing atypical instances and avoiding overfitting to shallow patterns with limited data during the process of fully-supervised training. To overcome these constraints, we present our approach, named RetroPrompt, which aims to achieve a balance between memorization and generalization by decoupling knowledge from mere memorization. Unlike traditional prompting methods, RetroPrompt leverages a publicly accessible knowledge base generated from the training data and incorporates a retrieval mechanism throughout the input, training, and inference stages. This enables the model to actively retrieve relevant contextual information from the corpus, thereby enhancing the available cues. We conduct comprehensive experiments on a variety of datasets across natural language processing and computer vision tasks to demonstrate the superior performance of our proposed approach, RetroPrompt, in both zero-shot and few-shot scenarios. Through detailed analysis of memorization patterns, we observe that RetroPrompt effectively reduces the reliance on rote memorization, leading to enhanced generalization.
Problem

Research questions and friction points this paper is trying to address.

Addresses overfitting in prompt learning for PFMs
Enhances generalization by reducing rote memorization reliance
Improves few-shot performance with retrieval-augmented knowledge integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-augmented prompt learning balances memorization and generalization
Decouples knowledge from memorization using a public knowledge base
Active retrieval mechanism enhances cues across input, training, inference
🔎 Similar Papers
No similar papers found.
X
Xiang Chen
MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
Y
Yixin Ou
Zhejiang University
Q
Quan Feng
Hunan Vanguard Group Corporation Limited
L
Lei Li
Zhejiang University
P
Piji Li
MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
H
Haibo Ye
MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
Sheng-Jun Huang
Sheng-Jun Huang
Nanjing University of Aeronautics & Astronautics
Machine Learning
Shuofei Qiao
Shuofei Qiao
Zhejiang University
AI AgentLarge Language ModelsNatural Language ProcessingKnowledge Graphs
Shumin Deng
Shumin Deng
National University of Singapore
NLPLLM Planning & ReasoningLLM AgentKGIE
H
Huajun Chen
Zhejiang University
Ningyu Zhang
Ningyu Zhang
Ph.D. Student, Vanderbilt University
artificial intelligencelearning analyticslearning environments