🤖 AI Summary
To address the dual challenges of excessive memory overhead and user privacy leakage in large language model (LLM) deployment, this paper proposes the first differentially private (DP) fine-tuning framework to integrate side networks and invertible architectures. The method combines memory-aware gradient computation with a privacy-accuracy co-optimization strategy, achieving substantial GPU memory reduction under strict DP guarantees (ε ≤ 2). Compared to standard DP-SGD, it sidesteps the inherent bottlenecks of high memory consumption and training inefficiency: across multiple tasks, it attains up to 58% memory compression while incurring only a 1.2-point drop in GLUE average score, preserving accuracy comparable to non-private baselines. The core contribution is the novel integration of invertible networks and side networks into DP fine-tuning, enabling, for the first time, the simultaneous optimization of strong privacy protection and memory efficiency.
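The paper's exact algorithm is not reproduced here; as background, the DP-SGD baseline it compares against clips each example's gradient to a fixed norm and adds calibrated Gaussian noise before the update. A minimal numpy sketch of that per-example step follows; `clip_norm` and `noise_multiplier` are illustrative hyperparameters, not values from the paper:

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD-style gradient step: clip each example's gradient to
    clip_norm, average, then add Gaussian noise scaled to the clip bound."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds clip_norm.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    avg = np.mean(clipped, axis=0)
    # Noise standard deviation follows the usual sigma * C / batch_size scaling.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_sample_grads),
                       size=avg.shape)
    return avg + noise
```

Storing one gradient per example is exactly what makes DP-SGD memory-hungry, which motivates the side-network and invertible designs the paper proposes.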
📝 Abstract
Large language models have repeatedly shown outstanding performance across diverse applications. However, deploying these models can inadvertently put user privacy at risk, and their substantial memory demands during training place a heavy load on hardware resources, raising considerable practical concerns. In this paper, we introduce DP-MemArc, a novel training framework aimed at reducing the memory costs of fine-tuning large language models while protecting user data privacy. DP-MemArc incorporates side-network and reversible-network designs to support a variety of differentially private, memory-efficient fine-tuning schemes. Our approach not only achieves significant memory savings but also ensures robust privacy protection, keeping user data secure and confidential. Extensive experiments demonstrate that DP-MemArc effectively provides differentially private, memory-efficient fine-tuning across different task scenarios.
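The paper's reversible-network design is not specified in the abstract; the general idea behind such architectures is that a block's inputs can be reconstructed exactly from its outputs, so activations need not be cached for backpropagation. A minimal additive-coupling sketch (the functions `f` and `g` stand in for arbitrary sub-networks and are assumptions for illustration):

```python
import numpy as np

def rev_forward(x1, x2, f, g):
    """Additive-coupling reversible block: split input into two streams,
    so intermediate activations can be recomputed instead of stored."""
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def rev_inverse(y1, y2, f, g):
    """Recover the block's inputs exactly from its outputs, undoing the
    two coupling additions in reverse order."""
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2
```

Because the inverse is exact, the backward pass can regenerate activations on the fly, trading a little recomputation for a large reduction in training memory.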