Parameter-Efficient Fine-Tuning via Circular Convolution

📅 2024-07-27
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
To address the limited expressiveness of Low-Rank Adaptation (LoRA) stemming from its intrinsic low-rank constraint, this paper proposes Circular Convolution Adaptation (C³A)—the first parameter-efficient fine-tuning (PEFT) framework integrating learnable circular convolution operators. C³A generates structured, high-rank weight updates while introducing parameter overhead comparable to LoRA, thereby substantially enhancing modeling capacity. Its modular, lightweight plug-in design enables seamless integration into arbitrary Transformer layers without architectural modification. Extensive experiments across natural language understanding (NLU), natural language generation (NLG), and multimodal adaptation tasks demonstrate that C³A consistently outperforms LoRA and its variants: it achieves an average accuracy gain of 1.8% under identical parameter budgets and reduces GPU memory consumption by 12%.

📝 Abstract
Low-Rank Adaptation (LoRA) has gained popularity for fine-tuning large foundation models, leveraging low-rank matrices $\mathbf{A}$ and $\mathbf{B}$ to represent weight changes (i.e., $\Delta \mathbf{W} = \mathbf{B} \mathbf{A}$). This method reduces trainable parameters and mitigates heavy memory consumption associated with full delta matrices by sequentially multiplying $\mathbf{A}$ and $\mathbf{B}$ with the activation. Despite its success, the intrinsic low-rank characteristic may limit its performance. Although several variants have been proposed to address this issue, they often overlook the crucial computational and memory efficiency brought by LoRA. In this paper, we propose Circular Convolution Adaptation (C$^3$A), which not only achieves high-rank adaptation with enhanced performance but also excels in both computational power and memory utilization. Extensive experiments demonstrate that C$^3$A consistently outperforms LoRA and its variants across various fine-tuning tasks.
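The core idea in the abstract, replacing the low-rank product $\mathbf{B}\mathbf{A}$ with a circulant weight update applied via circular convolution, can be sketched in NumPy. This is a minimal illustrative sketch, not the authors' implementation; all names here are hypothetical. Multiplying by a circulant matrix equals circular convolution with its first column, which the FFT computes in $O(d \log d)$ time while storing only $d$ parameters, and a generic circulant matrix is full-rank:

```python
import numpy as np

def c3a_update(x, kernel):
    """Compute Delta_W @ x, where Delta_W is the circulant matrix whose
    first column is `kernel`, via circular convolution in the Fourier
    domain: IFFT(FFT(kernel) * FFT(x))."""
    return np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x)).real

d = 8
rng = np.random.default_rng(0)
kernel = rng.standard_normal(d)  # d trainable parameters
x = rng.standard_normal(d)       # an activation vector

# Dense reference: explicitly build the circulant matrix
# C[i, j] = kernel[(i - j) % d], i.e. column j is kernel rolled by j.
C = np.stack([np.roll(kernel, j) for j in range(d)], axis=1)

# The FFT path matches the dense matrix-vector product.
assert np.allclose(C @ x, c3a_update(x, kernel))

# Unlike a rank-r LoRA update, a generic circulant matrix has full rank d.
assert np.linalg.matrix_rank(C) == d
```

The contrast with LoRA: a rank-$r$ update $\mathbf{B}\mathbf{A}$ spends $2dr$ parameters for a matrix of rank at most $r$, whereas the circulant update spends $d$ parameters for a (generically) rank-$d$ matrix, and the FFT keeps the forward cost low.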
Problem

Research questions and friction points this paper is trying to address.

LoRA's intrinsic low-rank constraint can limit fine-tuning performance
Existing high-rank variants often sacrifice LoRA's computational and memory efficiency
How to achieve high-rank adaptation without extra parameter, compute, or memory cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

Circular Convolution Adaptation (C³A): circulant, high-rank weight updates via circular convolution
Preserves LoRA-level computational and memory efficiency
Consistently outperforms LoRA and its variants across diverse fine-tuning tasks
Aochuan Chen
The Hong Kong University of Science and Technology (Guangzhou)
Machine Learning, Applied Data Science

Ziqi Gao
HKUST
AI for Protein, Graph Machine Learning

Zijing Liu
The Hong Kong University of Science and Technology (Guangzhou)

Yu Li
The Hong Kong University of Science and Technology (Guangzhou)

Jia Li
The Hong Kong University of Science and Technology (Guangzhou)