🤖 AI Summary
To address the slow convergence and lack of theoretical guarantees in decentralized LoRA training—caused by gradient nonsmoothness and model consensus interference—this paper establishes the first convergence theory for such settings, proving that DeCAF achieves a convergence rate matching that of decentralized SGD. DeCAF integrates truncated SVD-based low-rank updates with explicit consensus constraints, eliminating consensus interference while preserving communication and computational efficiency. The theoretical analysis rigorously characterizes how low-rank parameter coupling affects convergence under nonsmooth objectives. Experiments demonstrate that DeCAF significantly outperforms local training on vision and language tasks, matches federated learning performance under both IID and non-IID data distributions, and exhibits stable convergence and strong scalability.
📝 Abstract
Low-Rank Adaptation (LoRA) has emerged as one of the most effective, computationally tractable fine-tuning approaches for training Vision-Language Models (VLMs) and Large Language Models (LLMs). LoRA accomplishes this by freezing the pre-trained model weights and injecting trainable low-rank matrices, allowing for efficient learning of these foundation models even on edge devices. However, LoRA in decentralized settings remains underexplored, particularly its theoretical underpinnings, owing to the lack of smoothness guarantees and to model consensus interference (defined formally below). This work improves the convergence rate of decentralized LoRA (DLoRA) to match the rate of decentralized SGD by ensuring gradient smoothness. We also introduce DeCAF, a novel algorithm integrating DLoRA with truncated singular value decomposition (TSVD)-based matrix factorization to resolve consensus interference. Theoretical analysis shows that TSVD's approximation error is bounded and that consensus differences between DLoRA and DeCAF vanish as the rank increases, yielding DeCAF's matching convergence rate. Extensive experiments across vision and language tasks demonstrate that our algorithms outperform local training and rival federated learning under both IID and non-IID data distributions.
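To make the TSVD-based factorization step concrete, the sketch below illustrates the general idea: after nodes average their low-rank LoRA updates (whose sum may exceed rank r), a rank-r truncated SVD re-factors the consensus matrix into fresh low-rank factors. This is a minimal illustration of TSVD re-factorization, not the paper's exact DeCAF algorithm; the names `tsvd_refactor`, `W_avg`, and `r` are hypothetical.

```python
import numpy as np

def tsvd_refactor(consensus_update: np.ndarray, r: int):
    """Re-factor an averaged LoRA update W ~ B @ A via rank-r truncated SVD.

    Illustrative sketch: function name and signature are assumptions,
    not taken from the paper.
    """
    U, s, Vt = np.linalg.svd(consensus_update, full_matrices=False)
    B = U[:, :r] * s[:r]   # shape (d_out, r), singular values folded into B
    A = Vt[:r, :]          # shape (r, d_in)
    return B, A

# Example: average two nodes' rank-2 updates, then re-factor at rank 2.
rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2
updates = [rng.standard_normal((d_out, r)) @ rng.standard_normal((r, d_in))
           for _ in range(2)]
W_avg = sum(updates) / len(updates)   # consensus matrix; rank can reach 2*r
B, A = tsvd_refactor(W_avg, r)
err = np.linalg.norm(W_avg - B @ A) / np.linalg.norm(W_avg)
print(f"relative rank-{r} TSVD approximation error: {err:.3f}")
```

By the Eckart-Young theorem, the truncated SVD gives the best rank-r approximation of the averaged update, which is the sense in which the approximation error is bounded and shrinks as the rank grows.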