🤖 AI Summary
To address the challenge of sustaining personalized adaptation for downstream users amid frequent large language model (LLM) updates, this paper proposes PortLLM: a training-free, lightweight model-patching mechanism whose patches are reusable across model versions. PortLLM leverages parameter-space alignment and incremental perturbation modeling to transfer domain-specific knowledge acquired on legacy LLMs to newer versions, without additional training or fine-tuning data. It is compatible with prevalent parameter-efficient fine-tuning (PEFT) paradigms such as LoRA and supports mainstream open-source architectures including Mistral, Llama, and Gemma. Evaluated on seven benchmark datasets (e.g., BoolQ, GSM8K), PortLLM matches LoRA fine-tuning performance while reducing peak GPU memory consumption by up to 12.2×. A theoretical analysis formally supports patch portability across model versions. To our knowledge, PortLLM is the first approach enabling low-overhead, sustainable personalization of evolving LLMs without retraining.
📝 Abstract
As large language models (LLMs) increasingly shape the AI landscape, fine-tuning pretrained models has become more popular than in the pre-LLM era for achieving optimal performance on domain-specific tasks. However, pretrained LLMs such as ChatGPT are periodically evolved (i.e., model parameters are frequently updated), making it challenging for downstream users with limited resources to keep up with fine-tuning the newest LLMs for their domain applications. Even though fine-tuning costs have been reduced thanks to innovations in parameter-efficient fine-tuning such as LoRA, not all downstream users have adequate computing resources for frequent personalization. Moreover, access to fine-tuning datasets, particularly in sensitive domains such as healthcare, could be time-restricted, making it crucial to retain the knowledge encoded in earlier fine-tuning rounds for future adaptation. In this paper, we present PortLLM, a training-free framework that (i) creates an initial lightweight model update patch to capture domain-specific knowledge, and (ii) allows subsequent seamless plugging of that patch for the continual personalization of the evolved LLM at minimal cost. Our extensive experiments cover seven representative datasets, from easier question-answering tasks {BoolQ, SST2} to harder reasoning tasks {WinoGrande, GSM8K}, and models including {Mistral-7B, Llama2, Llama3.1, and Gemma2}, validating the portability of our designed model patches and showcasing the effectiveness of our proposed framework. For instance, PortLLM achieves performance comparable to LoRA fine-tuning with reductions of up to 12.2× in GPU memory usage. Finally, we provide theoretical justifications for the portability of our model update patches, offering new insights into the theoretical dimension of LLM personalization.
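The two-step idea in the abstract can be illustrated with a toy sketch: a low-rank (LoRA-style) patch is extracted from fine-tuning on the old model, then added directly to the evolved model's weights with no retraining. This is a minimal NumPy illustration of the concept only; the matrix sizes, the `0.01` drift scale, and all variable names are illustrative assumptions, not the paper's actual procedure or notation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # toy hidden size and low-rank patch rank (illustrative values)

# Step (i): a LoRA-style patch learned on the OLD pretrained weights,
# capturing domain-specific knowledge as a low-rank update.
W_old = rng.normal(size=(d, d))
B = rng.normal(size=(d, r))
A = rng.normal(size=(r, d))
delta_W = B @ A  # lightweight model update patch

# The provider releases an evolved model; here its drift from W_old is
# simulated as a small random perturbation (an assumption of this sketch).
W_new = W_old + 0.01 * rng.normal(size=(d, d))

# Step (ii): plug the old patch into the evolved model, training-free.
W_personalized = W_new + delta_W

# The ported model differs from the old personalized model only by the
# version drift, which is small here by construction.
gap = np.linalg.norm(W_personalized - (W_old + delta_W))
print(gap)
```

The sketch only shows why porting can be cheap: applying a stored `delta_W` is a single addition per weight matrix, requiring no optimizer state, gradients, or fine-tuning data on the new model.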