An Online Learning Approach to Prompt-based Selection of Generative Models

📅 2024-10-17
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Problem: Text-based generative models vary significantly in performance across prompts, so statically selecting a single model often means querying a suboptimal one. Method: This paper proposes the first online learning framework for prompt-dependent model ranking, formulated as a contextual multi-armed bandit problem in which all arms share the same context (the prompt). It introduces PAK-UCB, an algorithm that uses prompt embeddings to fit kernel-based score predictors for each model, and applies Random Fourier Features (RFF) to accelerate the kernel regression. The RFF-based variant achieves an Õ(√T) regret bound over T iterations. Results: Evaluated on real-world and synthetic text-to-image and image-to-text tasks, the method improves model selection accuracy (+12.7%–23.4%) and reduces average query cost (−18.3%–31.6%). These results support the effectiveness, generalizability, and practicality of prompt-aware dynamic model selection.

📝 Abstract
Selecting a sample generation scheme from multiple text-based generative models is typically addressed by choosing the model that maximizes an averaged evaluation score. However, this score-based selection overlooks the possibility that different models achieve the best generation performance for different types of text prompts. An online identification of the best generation model for various input prompts can reduce the costs associated with querying sub-optimal models. In this work, we explore the possibility of varying rankings of text-based generative models for different text prompts and propose an online learning framework to predict the best data generation model for a given input prompt. The proposed PAK-UCB algorithm addresses a contextual bandit (CB) setting with shared context variables across the arms, utilizing the generated data to update kernel-based functions that predict the score of each model available for unseen text prompts. Additionally, we leverage random Fourier features (RFF) to accelerate the online learning process of PAK-UCB and establish a $\widetilde{\mathcal{O}}(\sqrt{T})$ regret bound for the proposed RFF-based CB algorithm over $T$ iterations. Our numerical experiments on real and simulated text-to-image and image-to-text generative models show that RFF-UCB performs successfully in identifying the best generation model across different sample types.
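
The selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' PAK-UCB implementation: it assumes a Gaussian kernel, one ridge regressor per model on random Fourier features of the prompt embedding, and a fixed exploration weight. All names (`make_rff`, `RffUcbSelector`, `run_demo`, `beta`, `reg`) and the synthetic two-cluster scores are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_rff(dim, num_features, bandwidth, rng):
    """Return phi with phi(x) @ phi(y) ~= exp(-||x - y||^2 / (2 * bandwidth^2))."""
    omega = rng.normal(scale=1.0 / bandwidth, size=(num_features, dim))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    def phi(x):
        return np.sqrt(2.0 / num_features) * np.cos(omega @ x + phase)

    return phi


class RffUcbSelector:
    """One ridge regressor per model (arm); every arm sees the same prompt features."""

    def __init__(self, num_arms, dim, num_features=64, bandwidth=1.0,
                 reg=1.0, beta=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.phi = make_rff(dim, num_features, bandwidth, rng)
        self.beta = beta  # exploration weight (assumed constant here)
        # Per-arm inverse of the regularised Gram matrix, and response vector.
        self.A_inv = [np.eye(num_features) / reg for _ in range(num_arms)]
        self.b = [np.zeros(num_features) for _ in range(num_arms)]

    def select(self, prompt_embedding):
        """Pick the arm with the largest upper confidence bound on its score."""
        z = self.phi(prompt_embedding)
        ucb = []
        for A_inv, b in zip(self.A_inv, self.b):
            mean = z @ A_inv @ b
            width = np.sqrt(max(z @ A_inv @ z, 0.0))
            ucb.append(mean + self.beta * width)
        return int(np.argmax(ucb))

    def update(self, arm, prompt_embedding, score):
        """Update only the queried arm with the observed score."""
        z = self.phi(prompt_embedding)
        A_inv = self.A_inv[arm]
        Az = A_inv @ z
        # Sherman-Morrison rank-one update of (A + z z^T)^{-1}.
        self.A_inv[arm] = A_inv - np.outer(Az, Az) / (1.0 + z @ Az)
        self.b[arm] = self.b[arm] + score * z


def run_demo(rounds=400, seed=0):
    """Synthetic check: model 0 is better on one prompt cluster, model 1 on the other."""
    rng = np.random.default_rng(seed)
    centers = np.array([[1.0, 0.0], [-1.0, 0.0]])
    mean_score = np.array([[0.9, 0.2], [0.2, 0.9]])  # rows: cluster, cols: model
    selector = RffUcbSelector(num_arms=2, dim=2, seed=seed)
    hits = []
    for _ in range(rounds):
        cluster = rng.integers(2)
        prompt = centers[cluster] + 0.1 * rng.normal(size=2)
        arm = selector.select(prompt)
        score = mean_score[cluster, arm] + 0.05 * rng.normal()
        selector.update(arm, prompt, score)
        hits.append(arm == cluster)
    # Fraction of the last 100 rounds in which the better model was chosen.
    return float(np.mean(hits[-100:]))
```

Calling `run_demo()` returns the fraction of late rounds in which the selector picked the model that is better for the sampled prompt cluster. Because the feature dimension is fixed, each update costs the same regardless of how many prompts have been seen, which is the practical motivation for using RFF in place of exact kernel regression.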
Problem

Research questions and friction points this paper is trying to address.

Online selection of generative models
Contextual bandit with shared variables
Accelerated learning with random Fourier features
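
To make the bandit formulation concrete, the regret that an $\widetilde{\mathcal{O}}(\sqrt{T})$ bound refers to can be written as follows. The notation is an assumption for illustration and may differ from the paper's: $y_t$ is the prompt at round $t$, $s_g(y)$ is the expected score of model $g$ on prompt $y$, $g_t$ is the model queried at round $t$, and $G$ is the number of models.

```latex
\mathrm{Regret}(T) \;=\; \sum_{t=1}^{T} \Bigl( \max_{g \in \{1,\dots,G\}} s_g(y_t) \;-\; s_{g_t}(y_t) \Bigr),
\qquad
\mathrm{Regret}(T) = \widetilde{\mathcal{O}}\bigl(\sqrt{T}\bigr)
\;\Longrightarrow\;
\frac{\mathrm{Regret}(T)}{T} = \widetilde{\mathcal{O}}\!\left(\frac{1}{\sqrt{T}}\right) \to 0 .
```

In words, a sublinear regret bound means the average per-round score gap to the best model for each prompt vanishes as the number of rounds grows.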
Innovation

Methods, ideas, or system contributions that make the work stand out.

Online learning framework
PAK-UCB algorithm
Random Fourier features
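
For reference, the standard random Fourier feature approximation can be stated as below, assuming a Gaussian kernel with bandwidth $\sigma$ on prompt embeddings $y, y' \in \mathbb{R}^d$. The kernel choice and the symbols $D$, $\omega_i$, $b_i$ are assumptions for illustration, not details taken from the paper.

```latex
k(y, y') = \exp\!\left(-\frac{\lVert y - y' \rVert^2}{2\sigma^2}\right)
= \mathbb{E}_{\omega \sim \mathcal{N}(0,\, \sigma^{-2} I_d)}\bigl[\cos\bigl(\omega^\top (y - y')\bigr)\bigr]
\;\approx\; \varphi(y)^\top \varphi(y'),
\qquad
\varphi(y) = \sqrt{\tfrac{2}{D}}\,\bigl[\cos(\omega_i^\top y + b_i)\bigr]_{i=1}^{D},
\quad
\omega_i \sim \mathcal{N}(0,\, \sigma^{-2} I_d),\;
b_i \sim \mathrm{Unif}[0, 2\pi].
```

With this map, kernel regression reduces to linear regression in $D$ dimensions, so the per-round cost depends on $D$ and not on the number of prompts observed so far.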