SplitCom: Communication-efficient Split Federated Fine-tuning of LLMs via Temporal Compression

📅 2026-02-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges of on-device federated fine-tuning of large language models, which are constrained by limited computation and memory resources as well as high communication overhead. To mitigate these issues, the authors propose SplitCom, a framework that partitions the model between client and server and introduces a temporal activation compression mechanism inspired by video compression, uploading activations only when significant changes occur. Communication costs are further reduced through adaptive threshold control—implemented via either Bang-Bang control or deep deterministic policy gradient (DDPG) reinforcement learning—and dimensionality reduction techniques. The framework is extended into a U-shaped architecture to preserve label privacy. Experiments demonstrate that SplitCom reduces uplink communication costs by 98.6% under standard settings, while the U-shaped variant achieves a 95.8% reduction in total communication cost, all without compromising model performance.

📝 Abstract
Federated fine-tuning of on-device large language models (LLMs) mitigates privacy concerns by preventing raw data sharing. However, the intensive computational and memory demands pose significant challenges for resource-constrained edge devices. To overcome these limitations, split federated learning (SFL) emerges as a promising solution that partitions the model into lightweight client-side and compute-intensive server-side sub-models, thus offloading the primary training workload to a powerful server. Nevertheless, high-dimensional activation exchanges in SFL lead to excessive communication overhead. To address this, we propose SplitCom, a communication-efficient SFL framework for LLMs that exploits temporal redundancy in activations across consecutive training epochs. Inspired by video compression, the core innovation of our framework lies in selectively uploading activations only when a noticeable deviation from previous epochs occurs. To balance communication efficiency and learning performance, we introduce two adaptive threshold control schemes based on 1) bang-bang control or 2) deep deterministic policy gradient (DDPG)-based reinforcement learning. Moreover, we implement dimensionality reduction techniques to alleviate client-side memory requirements. Furthermore, we extend SplitCom to the U-shape architecture, ensuring the server never accesses clients' labels. Extensive simulations and laboratory experiments demonstrate that SplitCom reduces uplink communication costs by up to 98.6% in its standard configuration and total communication costs by up to 95.8% in its U-shape variant without noticeably compromising model performance.
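The selective-upload idea from the abstract can be illustrated with a minimal sketch: a client uploads its cut-layer activation only when it deviates noticeably from the last uploaded version, and a bang-bang controller nudges the threshold toward a target upload ratio. All names, the relative-L2 deviation metric, and the toy drift model below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def should_upload(activation, last_uploaded, threshold):
    """Upload only if the activation deviates noticeably (relative L2
    change) from the last uploaded version -- the temporal-redundancy
    idea described in the abstract. Metric choice is an assumption."""
    if last_uploaded is None:
        return True  # first epoch: nothing cached yet
    change = np.linalg.norm(activation - last_uploaded) / (
        np.linalg.norm(last_uploaded) + 1e-8)
    return change > threshold

def bang_bang_update(threshold, upload_ratio, target_ratio=0.2, step=0.01):
    """Two-state (bang-bang) controller: raise the threshold when
    uploads are too frequent, lower it when too rare."""
    if upload_ratio > target_ratio:
        return threshold + step
    return max(threshold - step, 0.0)

# Toy simulation: one activation vector drifting slowly across "epochs".
rng = np.random.default_rng(0)
act = rng.standard_normal(64)
last, threshold, uploads, epochs = None, 0.05, 0, 50
for _ in range(epochs):
    act = act + 0.01 * rng.standard_normal(64)  # small temporal drift
    if should_upload(act, last, threshold):
        last = act.copy()  # cache what the server now holds
        uploads += 1
    threshold = bang_bang_update(threshold, uploads / epochs)

print(f"uploaded {uploads}/{epochs} epochs, final threshold {threshold:.3f}")
```

Because the server keeps the last uploaded activation, skipped epochs cost no uplink bandwidth; the DDPG variant in the paper replaces the fixed-step controller with a learned threshold policy.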
Problem

Research questions and friction points this paper is trying to address.

split federated learning
large language models
communication overhead
activation exchange
edge devices
Innovation

Methods, ideas, or system contributions that make the work stand out.

Split Federated Learning
Temporal Compression
Activation Sparsification
Communication Efficiency
U-shape Architecture
Tao Li
City University of Hong Kong
Game Theory, Reinforcement Learning, Security, Intelligent Transportation, Networks
Yulin Tang
Department of Computer Science, Grinnell College, Iowa, United States
Yiyang Song
Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong SAR, China
Cong Wu
Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong SAR, China
Xihui Liu
University of Hong Kong, UC Berkeley, CUHK, Tsinghua University
Computer Vision, Deep Learning
Pan Li
Hangzhou Dianzi University, China
Xianhao Chen
Assistant Professor, The University of Hong Kong
Wireless networks, mobile edge computing, edge AI, distributed learning