🤖 AI Summary
Parameter-efficient fine-tuning (PEFT) of large language models has to balance three things at once: trainable parameter count, model performance, and computational/storage cost. This paper proposes Uniform Orthogonal Reinitialization Adaptation (UORA) to address that trade-off. Its core is a uniform, interpolation-based reinitialization mechanism under orthogonal constraints: guided by vector magnitudes, it selectively updates rows and columns of frozen projection matrices, combining low-rank approximation with interpolation-style reparameterization. The authors present this "uniform orthogonal reinitialization" as a paradigm distinct from both LoRA and VeRA. Empirically, the paper reports state-of-the-art performance across GLUE, E2E NLG, instruction tuning, and image classification benchmarks. Compared with LoRA, UORA trains substantially fewer parameters; compared with VeRA, it incurs lower computational and memory overhead; and it adds virtually no inference latency.
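To put the parameter comparison in concrete terms, the sketch below counts trainable parameters for the two baselines on a single weight matrix. The LoRA and VeRA counts follow those methods' standard formulations; the layer size and ranks are illustrative assumptions, and no UORA count is shown because this page does not give one.

```python
def lora_trainable_params(d_in, d_out, rank):
    """LoRA trains both low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def vera_trainable_params(d_out, rank):
    """VeRA keeps shared random A and B frozen and trains two scaling
    vectors: d (length rank) and b (length d_out)."""
    return rank + d_out


# Hypothetical layer shape and ranks, chosen only for illustration.
d_in = d_out = 768
lora_count = lora_trainable_params(d_in, d_out, rank=8)
vera_count = vera_trainable_params(d_out, rank=256)

print("LoRA:", lora_count)
print("VeRA:", vera_count)
```

Even with a much larger rank, the VeRA-style count stays far below the LoRA count here, because only the scaling vectors are trained while the projection matrices remain frozen.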
📝 Abstract
This paper introduces Uniform Orthogonal Reinitialization Adaptation (UORA), a novel parameter-efficient fine-tuning (PEFT) approach for Large Language Models (LLMs). UORA achieves state-of-the-art performance and parameter efficiency by using a low-rank approximation method to reduce the number of trainable parameters. Unlike existing methods such as LoRA and VeRA, UORA employs an interpolation-based reparameterization mechanism that selectively reinitializes rows and columns in frozen projection matrices, guided by a vector magnitude heuristic. As a result, UORA trains substantially fewer parameters than LoRA and is more efficient than VeRA in computation and storage. Comprehensive experiments across various benchmarks show that UORA delivers competitive fine-tuning performance with negligible computational overhead. We evaluate it on the GLUE and E2E benchmarks, as well as on instruction tuning of large language models and fine-tuning of image classification models. Our contributions establish a new paradigm for scalable and resource-efficient fine-tuning of LLMs.
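The abstract does not spell out the reinitialization procedure, so the following is only a rough sketch of what magnitude-guided selection could look like, not the paper's algorithm. It assumes a VeRA-style setup with frozen matrices `A` and `B` and a trainable scaling vector `scales`; those names, the function names, and the plain uniform re-draw are all hypothetical, and no orthogonality constraint is enforced.

```python
import random


def lowest_magnitude_indices(scales, k):
    """Return (in ascending order) the indices of the k smallest-|value| entries."""
    order = sorted(range(len(scales)), key=lambda i: abs(scales[i]))
    return sorted(order[:k])


def redraw_selected(A, B, scales, k, rng, bound=0.1):
    """Re-draw row i of A and column i of B for the k lowest-magnitude scales.

    Assumption: a uniform re-draw stands in for the paper's reinitialization
    step; this sketch does not orthogonalize the new rows or columns.
    """
    picked = lowest_magnitude_indices(scales, k)
    for i in picked:
        A[i] = [rng.uniform(-bound, bound) for _ in A[i]]
        for row in B:
            row[i] = rng.uniform(-bound, bound)
    return picked


rng = random.Random(0)
rank, d_in, d_out = 4, 6, 5  # toy sizes, for illustration only
A = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(rank)]   # frozen, rank x d_in
B = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(d_out)]  # frozen, d_out x rank
scales = [0.9, -0.02, 0.5, 0.01]  # made-up values for the trainable vector

A_before = [row[:] for row in A]
B_before = [row[:] for row in B]
picked = redraw_selected(A, B, scales, k=2, rng=rng)
print("re-drawn indices:", picked)
```

Only the rows and columns tied to the weakest scaling entries are touched; everything else in the frozen matrices is left as it was, which is the "selective" part of the idea.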