🤖 AI Summary
Traditional prompt learning is prone to overfitting and poor out-of-distribution generalization. To address this, we propose Repulsive Bayesian Prompt Learning (ReBaPL), a Bayesian inference framework that models the posterior distribution over prompt parameters. ReBaPL introduces repulsive potential functions, built on probability metrics such as Maximum Mean Discrepancy (MMD) and the Wasserstein distance, to explicitly enforce diversity in representation space. It further employs stochastic gradient Hamiltonian Monte Carlo (SGHMC) with a cyclical step-size schedule to enable efficient sampling from multimodal posteriors, mitigating mode collapse. ReBaPL is modular and extends existing prompt learning methods in a plug-and-play manner. Empirical evaluation across multiple benchmarks demonstrates improved generalization and a more comprehensive characterization of the prompt posterior. By combining uncertainty-aware prompting with structured posterior regularization, ReBaPL offers a robust and scalable approach to prompt learning.
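For concreteness, here is a minimal sketch of how a cyclical step size can plug into an SGHMC update augmented with a repulsive force. The cosine schedule follows the cyclical SG-MCMC literature (Zhang et al., 2020) and the base update is standard SGHMC (Chen et al., 2014); the repulsive gradient \(\nabla\Phi\) and its additive coupling are assumptions for illustration, not the paper's stated equations.

```latex
% Cosine cyclical step-size schedule (as in cyclical SG-MCMC):
% T total steps, M cycles, initial step size \epsilon_0.
\epsilon_t = \frac{\epsilon_0}{2}
  \left[ \cos\!\left( \frac{\pi \, \operatorname{mod}(t-1,\, \lceil T/M \rceil)}
                          {\lceil T/M \rceil} \right) + 1 \right]

% SGHMC update with momentum r_t, friction C, noise estimate \hat{B},
% minibatch negative log-posterior \tilde{U}, and an assumed repulsive
% gradient \nabla\Phi added to the force:
\theta_{t+1} = \theta_t + \epsilon_t \, r_t, \qquad
r_{t+1} = r_t
  - \epsilon_t \nabla \tilde{U}(\theta_t)
  - \epsilon_t \nabla \Phi(\theta_t)
  - \epsilon_t \, C \, r_t
  + \mathcal{N}\!\left( 0,\; 2 \epsilon_t \bigl( C - \hat{B} \bigr) \right)
```

Large step sizes at the start of each cycle drive exploration toward new modes; as \(\epsilon_t\) decays within the cycle, the dynamics settle into exploitation, refining the current mode.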
📝 Abstract
Prompt learning has emerged as an effective technique for adapting large-scale foundation models to downstream tasks. However, conventional prompt tuning methods are prone to overfitting and can struggle with out-of-distribution generalization. To address these limitations, Bayesian prompt learning frames prompt optimization as a Bayesian inference problem, improving robustness. This paper introduces Repulsive Bayesian Prompt Learning (ReBaPL), a novel Bayesian prompt learning method designed to efficiently explore the complex and often multimodal posterior landscape of prompts. Our method integrates a cyclical step-size schedule with stochastic gradient Hamiltonian Monte Carlo (SGHMC), enabling alternating phases of exploration, which discover new modes, and exploitation, which refine existing modes. Furthermore, we introduce a repulsive force derived from a potential function over probability metrics (including Maximum Mean Discrepancy and the Wasserstein distance) computed on the distributions of representations produced by different prompts. This representation-space repulsion diversifies exploration and prevents premature collapse to a single mode. Our approach yields a more comprehensive characterization of the prompt posterior distribution, leading to improved generalization. In contrast to prior Bayesian prompt learning methods, ReBaPL provides a modular, plug-and-play Bayesian extension of any existing prompt learning method based on maximum likelihood estimation. We demonstrate the efficacy of ReBaPL on several benchmark datasets, showing superior performance over state-of-the-art prompt learning methods.
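To illustrate the representation-space repulsion, the sketch below computes an MMD-based repulsive potential between the representation distributions induced by different prompts. The RBF kernel, the `exp(-MMD²/h)` form of the potential, and the names `rbf_kernel`, `repulsive_potential`, and `bandwidth` are illustrative assumptions rather than the paper's implementation; a Wasserstein-based variant would swap the MMD estimate for an optimal-transport distance.

```python
import torch

def rbf_kernel(a: torch.Tensor, b: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """RBF (Gaussian) kernel matrix between the rows of a and b."""
    sq_dists = torch.cdist(a, b) ** 2
    return torch.exp(-sq_dists / (2.0 * sigma ** 2))

def mmd2(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Biased estimator of squared MMD between two samples of representations."""
    return (rbf_kernel(x, x, sigma).mean()
            - 2.0 * rbf_kernel(x, y, sigma).mean()
            + rbf_kernel(y, y, sigma).mean())

def repulsive_potential(reps: list[torch.Tensor], bandwidth: float = 1.0) -> torch.Tensor:
    """Sum of pairwise exp(-MMD^2 / h) terms over prompt "particles".

    The potential is large when representation distributions overlap and
    decays as they separate, so descending its gradient during sampling
    pushes prompts toward distinct representation modes. reps[i] holds
    the (n_i, d) feature vectors produced by prompt i. (Assumed form.)
    """
    pot = torch.zeros(())
    for i in range(len(reps)):
        for j in range(i + 1, len(reps)):
            pot = pot + torch.exp(-mmd2(reps[i], reps[j]) / bandwidth)
    return pot

# Example: three prompt particles, each inducing 128 feature vectors in R^512.
if __name__ == "__main__":
    reps = [torch.randn(128, 512, requires_grad=True) for _ in range(3)]
    phi = repulsive_potential(reps)
    phi.backward()  # gradients w.r.t. representations flow back to the prompts
    print(f"repulsive potential: {phi.item():.4f}")
```

Because the potential is differentiable, its gradient can be added to the SGHMC force on the prompt parameters, encouraging concurrently sampled prompts to occupy distinct modes in representation space.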