How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the poorly understood capacity limits and dynamics of exact parametric memory when Low-Rank Adaptation (LoRA) is used to fine-tune large language models. Treating LoRA as a controlled memory-capacity probe in latent space, the study quantifies how much can be memorized and establishes the Parametric Memory Law, a power law linking loss reduction $\Delta L$ to effective parameter count and sequence length. At the token level, it identifies a deterministic phase transition: a prediction probability $p > 0.5$ is a sufficient condition for verbatim recall under greedy decoding. Building on these findings, the authors propose MemFT, a threshold-guided optimization strategy that dynamically redirects the training budget toward sub-threshold tokens. Experiments indicate that MemFT can improve both memory fidelity and training efficiency.
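
The sufficiency of $p > 0.5$ follows from a short argument, sketched here under the greedy-decoding setting stated in the abstract. The symbols $p_t$, $y_t$, and $V$ are notation assumed for this sketch, not taken from the paper: $p_t(v)$ is the model's next-token probability for token $v$ at position $t$ given the correct prefix, $y_t$ is the target token, and $V$ is the vocabulary.

$$
p_t(y_t) > 0.5
\;\Longrightarrow\;
\forall v \in V \setminus \{y_t\}:\quad
p_t(v) \;\le\; 1 - p_t(y_t) \;<\; 0.5 \;<\; p_t(y_t)
\;\Longrightarrow\;
\arg\max_{v \in V} p_t(v) = y_t .
$$

If this holds at every position, greedy decoding emits $y_1$, then conditions on the correct prefix and emits $y_2$, and so on by induction, so the whole sequence is reproduced verbatim. The condition is sufficient but not necessary: a target with probability $0.4$ is still the greedy choice when no competitor exceeds $0.4$.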

📝 Abstract

Large Language Models (LLMs) must continuously learn and update knowledge to remain effective in dynamic real-world environments. While Low-Rank Adaptation (LoRA) is widely used for such memory updates, existing studies mainly rely on qualitative downstream evaluations, leaving the quantitative capacity limits and underlying dynamics of exact parametric memory largely unexplored. To bridge this gap, we employ LoRA as a controlled memory capacity probe within the latent space to systematically quantify exact parametric memory. We introduce the Parametric Memory Law, a robust power law linking loss reduction $\Delta L$ to effective parameters and sequence length. At the token level, fine-grained analysis reveals a deterministic phase transition, demonstrating that a prediction probability of $p > 0.5$ constitutes a sufficient condition for verbatim recall under greedy decoding. Driven by these insights, we introduce MemFT, a threshold-guided optimization strategy that dynamically redistributes the training budget toward sub-threshold tokens. Empirical evaluations demonstrate that MemFT can enhance memory fidelity and efficiency. Code will be released at https://github.com/zjunlp/ParametricMemoryLaw.
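
To make the idea of redistributing the training budget toward sub-threshold tokens concrete, here is a minimal Python sketch. It is not the authors' MemFT implementation, which the abstract does not specify. It assumes one simple reading of the idea: tokens whose target probability already exceeds the threshold are dropped from the loss, and the remaining tokens share it. The function names `threshold_guided_loss` and `greedy_recalls` are hypothetical.

```python
import math


def threshold_guided_loss(target_probs, threshold=0.5):
    """Mean negative log-likelihood over sub-threshold tokens only.

    target_probs: probabilities (each in (0, 1]) that the model assigns
    to the correct token at each position.
    Assumption: tokens with probability > threshold are excluded, so the
    gradient budget concentrates on tokens that are not yet secured.
    """
    active = [p for p in target_probs if p <= threshold]
    if not active:
        # Every token is above the threshold; nothing left to train on.
        return 0.0
    return sum(-math.log(p) for p in active) / len(active)


def greedy_recalls(step_distributions, targets):
    """Return True if greedy decoding picks the target at every step.

    step_distributions: one dict per position mapping token -> probability,
    computed with the correct prefix as context.
    """
    return all(
        max(dist, key=dist.get) == tok
        for dist, tok in zip(step_distributions, targets)
    )
```

For example, with target probabilities `[0.9, 0.3, 0.6, 0.5]` only the second and fourth tokens contribute to the loss. In a real fine-tuning loop the same masking would be applied to per-token losses before averaging.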

Problem

Research questions and friction points this paper is trying to address.

LoRA

parametric memory

memory capacity

LLM finetuning

memory dynamics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parametric Memory Law

LoRA

memory fidelity