🤖 AI Summary
Although vector-based adaptation methods are extremely parameter-efficient, they typically need a much higher rank than LoRA to match its performance, which increases training overhead. This work proposes a gradient-guided initialization strategy that, for the first time, incorporates gradient information into the basis initialization of vector-based adaptation. The method reduces the required rank by 8× while keeping the parameter count extremely low, substantially improving training efficiency. Evaluated on natural language understanding, natural language generation, and image classification tasks, the approach matches or surpasses both LoRA and existing vector-based adaptation methods.
📝 Abstract
As model sizes continue to grow, parameter-efficient fine-tuning has emerged as a powerful alternative to full fine-tuning. While LoRA is widely adopted among these methods, recent research has explored vector-based adaptation methods for their extreme parameter efficiency. However, these methods typically require substantially higher ranks than LoRA to match its performance, which increases training costs. This work introduces GiVA, a gradient-based initialization strategy for vector-based adaptation. GiVA achieves training times comparable to LoRA while maintaining the extreme parameter efficiency of vector-based adaptation. We evaluate GiVA across diverse benchmarks spanning natural language understanding, natural language generation, and image classification. Experiments show that our approach consistently outperforms or is competitive with existing vector-based adaptation methods and LoRA while reducing rank requirements by a factor of eight ($8\times$).
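The trade-off described in the abstract can be illustrated with a rough back-of-the-envelope sketch. It assumes a VeRA-style formulation of vector-based adaptation (frozen shared random projections with two trainable scaling vectors), which the abstract does not specify; the layer width and the ranks below are hypothetical values chosen only for illustration, not figures from the paper, and the FLOP estimate is a simple approximation of the adapter's two projections.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA trains two low-rank matrices: A (rank x d_in) and B (d_out x rank).
    return rank * (d_in + d_out)


def vector_trainable_params(d_out: int, rank: int) -> int:
    # Assumed VeRA-style adapter: the projection matrices are frozen, and only
    # two scaling vectors (length rank and length d_out) are trained.
    return rank + d_out


def adapter_flops_per_token(d_in: int, d_out: int, rank: int) -> int:
    # Approximate multiply-add cost of the down- and up-projections.
    # This grows linearly with rank whether or not the matrices are trainable.
    return 2 * rank * (d_in + d_out)


# Hypothetical layer width and ranks, for illustration only.
d_in = d_out = 4096
lora_rank = 16
high_vector_rank = 1024
reduced_vector_rank = high_vector_rank // 8  # an 8x rank reduction

print("LoRA trainable params:        ", lora_trainable_params(d_in, d_out, lora_rank))
print("Vector params (high rank):    ", vector_trainable_params(d_out, high_vector_rank))
print("Vector params (reduced rank): ", vector_trainable_params(d_out, reduced_vector_rank))
print("Adapter FLOPs/token (high):   ", adapter_flops_per_token(d_in, d_out, high_vector_rank))
print("Adapter FLOPs/token (reduced):", adapter_flops_per_token(d_in, d_out, reduced_vector_rank))
```

Under these assumptions, the vector-based adapter trains far fewer parameters than LoRA even at a high rank, but its per-token adapter cost scales with the rank, which is why lowering the required rank matters for training time.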