🤖 AI Summary
Large language models (LLMs) are prohibitively expensive to fine-tune because of their massive parameter counts; while prompt tuning offers parameter efficiency, its prompt embeddings are tied to the model's embedding dimension, limiting scalability. This paper proposes Ultra-Low-Dimensional Prompt Tuning (ULPT), which compresses trainable prompts into an extremely low-dimensional space (e.g., two dimensions) and projects them back to the original embedding dimension via a fixed random matrix. To preserve semantic expressivity, ULPT further introduces learnable shift and scale embeddings. The authors show theoretically that such random projections preserve the high-rank structure of the original prompt space. Experiments across 21 NLP tasks demonstrate that ULPT performs on par with standard prompt tuning while using only about 2% of its trainable parameters, outperforming mainstream parameter-efficient fine-tuning methods. ULPT thus offers superior scalability, practicality, and parameter efficiency without sacrificing accuracy.
📝 Abstract
Large language models achieve state-of-the-art performance but are costly to fine-tune due to their size. Parameter-efficient fine-tuning methods, such as prompt tuning, address this by reducing trainable parameters while maintaining strong performance. However, prior methods tie prompt embeddings to the model's dimensionality, which may not scale well to larger and more customized LLMs. In this paper, we propose Ultra-Low-Dimensional Prompt Tuning (ULPT), which optimizes prompts in a low-dimensional space (e.g., 2D) and uses a random but frozen matrix for the up-projection. To enhance alignment, we introduce learnable shift and scale embeddings. ULPT drastically reduces the trainable parameters: with 2 dimensions it uses only 2% of the parameters of vanilla prompt tuning while retaining most of the performance across 21 NLP tasks. Our theoretical analysis shows that random projections can capture high-rank structures effectively, and experimental results demonstrate ULPT's competitive performance over existing parameter-efficient methods.
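To make the parameterization concrete, below is a minimal PyTorch sketch of the mechanism described above: a trainable low-dimensional prompt, a frozen random up-projection, and learnable shift/scale embeddings. The class name `ULPTPrompt`, the initialization scales, and the exact shapes of the shift/scale vectors are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class ULPTPrompt(nn.Module):
    """Sketch of an ULPT-style prompt (assumed details, not the official code).

    Trainable pieces: an ultra-low-dimensional prompt `z` (one r-dim vector
    per prompt token) plus shift and scale embeddings in the model's
    dimension. The up-projection matrix is random and frozen, so it
    contributes no trainable parameters.
    """

    def __init__(self, num_tokens: int = 100, low_dim: int = 2, embed_dim: int = 768):
        super().__init__()
        # Trainable ultra-low-dimensional prompt: num_tokens x r.
        self.z = nn.Parameter(torch.randn(num_tokens, low_dim) * 0.01)
        # Frozen random up-projection: r x embed_dim (a buffer, never trained).
        self.register_buffer("proj", torch.randn(low_dim, embed_dim) / low_dim ** 0.5)
        # Learnable shift and scale embeddings for alignment with the model.
        self.scale = nn.Parameter(torch.ones(embed_dim))
        self.shift = nn.Parameter(torch.zeros(embed_dim))

    def forward(self) -> torch.Tensor:
        # Up-project to embedding space, then align:
        # P = scale * (z @ proj) + shift  ->  num_tokens x embed_dim.
        return self.scale * (self.z @ self.proj) + self.shift


prompt = ULPTPrompt()
trainable = sum(p.numel() for p in prompt.parameters() if p.requires_grad)
print(trainable)  # 100*2 + 2*768 = 1736, vs. 100*768 = 76800 for vanilla prompt tuning
# prompt() would be prepended to the input token embeddings at each forward pass.
```

With these assumed sizes (100 prompt tokens, r=2, d=768), the trainable count works out to roughly 2% of vanilla prompt tuning, consistent with the figure quoted in the abstract; the shared shift/scale vectors add only 2d parameters regardless of prompt length.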