🤖 AI Summary
This work addresses the challenge of efficiently and privately fine-tuning large language models on resource-constrained mobile devices, where high memory and computational overheads hinder existing approaches. Current federated learning methods either rely on costly backpropagation or employ zeroth-order optimization (ZOO), which suffers from slow convergence and low accuracy. To overcome these limitations, the authors propose CooperLLM, a framework that integrates cloud-side gradient guidance with edge-side ZOO: mobile devices perform lightweight local updates via ZOO, while the cloud uses public data and backpropagation to generate guiding perturbations that rectify local gradients, accelerating convergence and improving accuracy. Pipeline scheduling and adaptive compression techniques further alleviate system bottlenecks. Experiments show that, compared to state-of-the-art ZOO methods, CooperLLM reduces device memory usage by up to 86.4%, converges up to 8.8× faster, and improves accuracy by as much as 10 percentage points.
📝 Abstract
Large Language Models (LLMs) perform well on many NLP tasks, but fine-tuning them on resource-constrained mobile devices is challenging due to high memory and computation costs, despite growing demands for privacy-preserving personalization. Federated Learning (FL) enables local-data training, yet existing methods either rely on memory-intensive backpropagation or use zeroth-order optimization (ZOO), which avoids backward passes but suffers from slow convergence and degraded accuracy. We propose CooperLLM, a cloud-assisted edge-end cooperative federated fine-tuning framework that combines ZOO on mobile devices with cloud-guided gradient rectification. Mobile clients perform lightweight ZOO updates on private data, while the cloud fine-tunes on auxiliary public data using backpropagation and injects guided perturbations to rectify local updates, improving convergence and accuracy without violating privacy. To address system bottlenecks, CooperLLM introduces pipeline scheduling and adaptive compression to overlap computation and communication and reduce memory usage. Experiments on multiple Transformer models and datasets show that CooperLLM reduces on-device memory by up to $86.4\%$, accelerates convergence by $8.8 \times$, and improves accuracy by up to 10 percentage points over state-of-the-art ZOO-based baselines.
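The core mechanism the abstract describes, a zeroth-order (forward-pass-only) update whose random perturbation is biased by a cloud-supplied descent direction, can be sketched as follows. This is a minimal SPSA-style illustration under stated assumptions: `zoo_step`, `alpha`, and the toy quadratic objective are illustrative names and choices, not CooperLLM's actual implementation.

```python
import numpy as np

def zoo_step(theta, loss_fn, guide=None, mu=1e-3, lr=1e-2, alpha=0.5, rng=None):
    """One SPSA-style zeroth-order update (illustrative, not the paper's API).

    guide: optional cloud-supplied descent direction (unit vector) mixed into
    the random perturbation; alpha controls how strongly it biases sampling.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(theta.shape)
    if guide is not None:
        # Bias the perturbation toward the cloud's backprop-derived direction.
        z = (1.0 - alpha) * z + alpha * guide
    # Two forward passes only: no backward pass, so no activation storage,
    # which is where ZOO's memory savings on-device come from.
    d = (loss_fn(theta + mu * z) - loss_fn(theta - mu * z)) / (2.0 * mu)
    return theta - lr * d * z

# Toy problem: minimize ||theta - target||^2 without computing a gradient.
target = np.array([1.0, -2.0, 0.5])
loss = lambda th: float(np.sum((th - target) ** 2))

theta = np.zeros(3)
guide = target - theta
guide /= np.linalg.norm(guide)  # stand-in for a cloud-computed gradient direction
rng = np.random.default_rng(0)
for _ in range(500):
    theta = zoo_step(theta, loss, guide=guide, rng=rng)
```

The design intuition: a plain ZOO estimate probes a random direction, so its variance grows with dimension and convergence is slow; mixing in an informative direction from the cloud concentrates probes where the loss actually decreases, which is the "guided perturbation" idea the abstract attributes to CooperLLM.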