🤖 AI Summary
This work addresses accuracy degradation in language models caused by inference optimizations, such as quantization, pruning, or erroneous serialization, without access to the original training data. The authors propose a lightweight, data-free recovery method built on synthetic data generation, logit distillation, and selective LoRA fine-tuning: low-rank adapters are learned only at selected layers to align the degraded model with its full-precision counterpart. To the authors' knowledge, this is the first fully data-free, architecture-agnostic approach to accuracy recovery, compatible with diverse attention mechanisms and degradation sources. Evaluated across multiple small language models, the method restores accuracy by 5–17%, substantially improving post-deployment performance while adding negligible parameter overhead and remaining practical to deploy.
📝 Abstract
Inference optimizations such as quantization, pruning, format and datatype conversion, model export, and serialization can degrade language model task performance. While most work on performance recovery for deployment focuses on robust quantization techniques, we focus on recovering model accuracy from any source of weight degradation, such as improper model serialization. In this work, we propose Recover-LoRA, a lightweight and dataset-agnostic method to recover accuracy in degraded models. Recover-LoRA uses synthetic data and logit distillation to learn LoRA adapters on selected layers, aligning the degraded model with its full-precision counterpart. We investigate the utility of Recover-LoRA across a diverse set of small language models (SLMs), including models with varying attention architectures, multi-head attention (MHA) and grouped-query attention (GQA), as well as several evaluation datasets. Our results show that Recover-LoRA recovers model accuracy by 5–17% on MHA and GQA SLMs.
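The recipe the abstract describes (synthetic data, logit distillation, a LoRA adapter aligned to the full-precision model) can be sketched in miniature. The example below is a hypothetical illustration, not the paper's implementation: it distills the softmax outputs of a single full-precision linear layer into a degraded copy by training a rank-2 LoRA adapter on random synthetic probes with a KL-divergence loss. All names, dimensions, the noise model for the degradation, and the plain gradient-descent update are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 8, 2

# Teacher: full-precision weights. Student: a degraded copy
# (standing in for e.g. an improper serialization of the weights).
W_full = rng.normal(size=(d_out, d_in))
W_deg = W_full + 0.3 * rng.normal(size=(d_out, d_in))

# LoRA adapter: delta_W = B @ A, with B zero-initialized so that
# training starts exactly at the degraded model.
A = rng.normal(size=(rank, d_in)) / np.sqrt(d_in)
B = np.zeros((d_out, rank))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    """Mean KL(p || q) over a batch of distributions."""
    return np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1))

# "Synthetic data": random probe inputs; no real training data is used.
x = rng.normal(size=(256, d_in))
teacher = softmax(x @ W_full.T)

lr, losses = 0.1, []
for _ in range(300):
    student = softmax(x @ (W_deg + B @ A).T)
    losses.append(kl(teacher, student))
    # dKL/dlogits = student - teacher (averaged over the batch),
    # then chain through delta_W = B @ A to get the adapter gradients.
    g_logits = (student - teacher) / x.shape[0]
    g_W = g_logits.T @ x
    dB, dA = g_W @ A.T, B.T @ g_W
    B -= lr * dB
    A -= lr * dA

print(f"KL to teacher: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Only `A` and `B` are updated (rank × (d_in + d_out) parameters), while the degraded weights stay frozen, which mirrors the parameter efficiency the paper claims; the real method applies this idea selectively across transformer layers rather than to one toy matrix.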