🤖 AI Summary
To address privacy risks—both model and data leakage—arising from centralized fine-tuning of large language models (LLMs), this paper proposes a training-free, gradient-preserving off-site fine-tuning method. Unlike existing off-site tuning (OT) approaches, which incur high computational overhead and lack theoretical grounding, our work is the first to formulate off-site fine-tuning from an optimization perspective. We design a gradient-information preservation mechanism based on rank compression and channel pruning, integrated with lightweight adapters to enable secure adaptation. Crucially, the method operates without accessing either the original training data or the base model’s parameters, thereby providing strict privacy guarantees while drastically reducing computational cost. Experiments across multiple benchmark tasks demonstrate superior performance over state-of-the-art OT methods, achieving a favorable balance of efficiency, theoretical rigor, and practicality. This work establishes a novel paradigm for privacy-preserving LLM fine-tuning.
📝 Abstract
With the rapid growth of large language models (LLMs), centralized fine-tuning has emerged as a key technique for adapting these models to domain-specific tasks, but it poses privacy risks for both model and data owners. One promising solution, called offsite-tuning (OT), is proposed to address these challenges: a weaker emulator is compressed from the original model and then fine-tuned with adapters to enhance privacy. However, existing OT-based methods incur high computational costs and lack theoretical analysis. This paper introduces a novel OT approach based on gradient-preserving compression, named GradOT. By analyzing the OT problem through the lens of optimization, we propose a method that selectively applies compression techniques such as rank compression and channel pruning, preserving the gradients of the fine-tuned adapters while ensuring privacy. Extensive experiments demonstrate that our approach surpasses existing OT methods in both privacy protection and model performance. Our method provides a theoretical foundation for OT and offers a practical, training-free solution for offsite-tuning of large-scale LLMs.
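To make the two compression primitives named in the abstract concrete, here is a minimal NumPy sketch of how an emulator weight matrix might be derived via truncated-SVD rank compression followed by channel pruning. The function names and the magnitude-based pruning criterion are illustrative assumptions, not the paper's actual API or method.

```python
import numpy as np

def rank_compress(W, rank):
    # Truncated SVD: reconstruct W from its top-`rank` singular components.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * S[:rank]) @ Vt[:rank, :]

def channel_prune(W, keep_ratio):
    # Illustrative criterion: zero out the output channels (rows)
    # with the smallest L2 norm, keeping a `keep_ratio` fraction.
    norms = np.linalg.norm(W, axis=1)
    k = int(np.ceil(keep_ratio * W.shape[0]))
    keep = np.argsort(norms)[-k:]
    mask = np.zeros(W.shape[0], dtype=bool)
    mask[keep] = True
    W_pruned = W.copy()
    W_pruned[~mask] = 0.0
    return W_pruned

# Hypothetical dense layer weight; the emulator combines both compressions.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_emulator = channel_prune(rank_compress(W, rank=16), keep_ratio=0.5)
```

In GradOT's framing, the compression budget per layer would be chosen so that the gradients flowing into the fine-tuned adapters are preserved; the fixed `rank` and `keep_ratio` above stand in for that selection step.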