🤖 AI Summary
Large language models (LLMs) are prohibitively expensive to fine-tune because of their massive parameter counts; while prompt tuning offers parameter efficiency, its prompt embeddings are tied to the model's embedding dimension, limiting scalability. This paper proposes Ultra-Low-Dimensional Prompt Tuning (ULPT), which compresses trainable prompts into an extremely low-dimensional space (e.g., two dimensions) and projects them back to the original embedding dimension via a fixed random matrix. To preserve semantic expressivity, ULPT further introduces learnable shift and scale embeddings. The authors show theoretically that such random projections preserve the high-rank structure of the original prompt space. Experiments across 21 NLP tasks demonstrate that ULPT performs on par with standard prompt tuning while using only about 2% of its trainable parameters, outperforming mainstream parameter-efficient fine-tuning methods. ULPT thus offers superior scalability, practicality, and parameter efficiency without sacrificing accuracy.
📝 Abstract
Large language models achieve state-of-the-art performance but are costly to fine-tune due to their size. Parameter-efficient fine-tuning methods, such as prompt tuning, address this by reducing trainable parameters while maintaining strong performance. However, prior methods tie prompt embeddings to the model's dimensionality, which may not scale well to larger and more customized LLMs. In this paper, we propose Ultra-Low-Dimensional Prompt Tuning (ULPT), which optimizes prompts in a low-dimensional space (e.g., 2D) and uses a random but frozen matrix for the up-projection. To enhance alignment, we introduce learnable shift and scale embeddings. ULPT drastically reduces the trainable parameters: with 2 dimensions it uses only 2% of the parameters of vanilla prompt tuning while retaining most of the performance across 21 NLP tasks. Our theoretical analysis shows that random projections can capture high-rank structures effectively, and experimental results demonstrate ULPT's competitive performance over existing parameter-efficient methods.
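To make the parameterization concrete, below is a minimal PyTorch sketch of the mechanism described above: a trainable low-dimensional prompt, a frozen random up-projection, and learnable shift/scale embeddings. The class name `ULPTPrompt`, the initialization scales, and the exact shapes of the shift/scale vectors are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class ULPTPrompt(nn.Module):
    """Sketch of an ULPT-style prompt (assumed details, not the official code).

    Trainable pieces: an ultra-low-dimensional prompt `z` (one r-dim vector
    per prompt token) plus shift and scale embeddings in the model's
    dimension. The up-projection matrix is random and frozen, so it
    contributes no trainable parameters.
    """

    def __init__(self, num_tokens: int = 100, low_dim: int = 2, embed_dim: int = 768):
        super().__init__()
        # Trainable ultra-low-dimensional prompt: num_tokens x r.
        self.z = nn.Parameter(torch.randn(num_tokens, low_dim) * 0.01)
        # Frozen random up-projection: r x embed_dim (a buffer, never trained).
        self.register_buffer("proj", torch.randn(low_dim, embed_dim) / low_dim ** 0.5)
        # Learnable shift and scale embeddings for alignment with the model.
        self.scale = nn.Parameter(torch.ones(embed_dim))
        self.shift = nn.Parameter(torch.zeros(embed_dim))

    def forward(self) -> torch.Tensor:
        # Up-project to embedding space, then align:
        # P = scale * (z @ proj) + shift  ->  num_tokens x embed_dim.
        return self.scale * (self.z @ self.proj) + self.shift


prompt = ULPTPrompt()
trainable = sum(p.numel() for p in prompt.parameters() if p.requires_grad)
print(trainable)  # 100*2 + 2*768 = 1736, vs. 100*768 = 76800 for vanilla prompt tuning
# prompt() would be prepended to the input token embeddings at each forward pass.
```

With these assumed sizes (100 prompt tokens, r=2, d=768), the trainable count works out to roughly 2% of vanilla prompt tuning, consistent with the figure quoted in the abstract; the shared shift/scale vectors add only 2d parameters regardless of prompt length.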