AI Summary
Existing integrity-checking methods for detecting inference errors in large language models (LLMs) caused by memory bit flips suffer from high computational overhead and significant latency. This paper proposes a lightweight, online detection and localized recovery framework: it generates compact hash signatures via forward execution of short test vectors, enabling rapid fault localization through hash comparison; it then triggers block-level weight repair, bypassing full-model reloading. The core innovation is the co-design of a hash-guided verification mechanism with localization-driven, block-level recovery. Experimental evaluation across multiple mainstream LLMs demonstrates detection rates of over 94% for single-bit flips and nearly 100% for multi-bit flips, with only 1% to 7.7% inference overhead and recovery more than 100× faster than conventional model reloading.
Abstract
This paper presents LM-Fix, a lightweight detection and rapid recovery framework for faults in large language models (LLMs). Existing integrity approaches are often too heavy or too slow for modern LLMs. LM-Fix runs a short test-vector pass and uses hash-guided checks to detect bit-flip faults, then repairs the affected weights locally without a full reload. Across multiple models, it detects over 94% of single-bit flips at a test-vector length (TVL) of 200 and nearly 100% of multi-bit flips, with approximately 1% to 7.7% runtime overhead; recovery is more than 100× faster than reloading the model. These results show a practical, low-overhead solution for keeping LLMs reliable in production.
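The detect-then-repair loop described above can be sketched conceptually: run a short test vector through the model, hash each block's activations, compare against golden signatures computed on known-good weights, and reload only the first block whose hash diverges. This is a minimal illustration under assumed simplifications (a stack of dense layers standing in for transformer blocks, SHA-256 over rounded activations as the signature), not the paper's actual implementation:

```python
import hashlib
import numpy as np

def forward_with_hashes(layers, x):
    """Run a short test vector through the model, hashing each block's output.

    Rounding before hashing is a simplifying assumption to keep the
    signature insensitive to benign floating-point noise.
    """
    hashes = []
    for w in layers:
        x = np.tanh(x @ w)  # stand-in for a transformer block
        hashes.append(hashlib.sha256(np.round(x, 6).tobytes()).hexdigest())
    return hashes

def detect_and_repair(layers, golden_layers, golden_hashes, test_vec):
    """Locate the first block whose hash diverges and repair only that block.

    A bit flip in block i corrupts the hashes of block i and everything
    downstream, so the first mismatch localizes the fault. Repair copies
    just that block's weights back, avoiding a full-model reload.
    """
    current = forward_with_hashes(layers, test_vec)
    for i, (cur, ref) in enumerate(zip(current, golden_hashes)):
        if cur != ref:
            layers[i] = golden_layers[i].copy()  # block-level repair
            return i
    return None  # all signatures match: no fault detected

# Demo: flip one bit in block 1's weights, then detect and repair it.
np.random.seed(0)
layers = [np.random.randn(8, 8) for _ in range(4)]
golden = [w.copy() for w in layers]
test_vec = np.random.randn(1, 8)
golden_hashes = forward_with_hashes(golden, test_vec)

layers[1].view(np.uint8).flat[7] ^= 0x40  # single-bit flip in block 1
faulty_block = detect_and_repair(layers, golden, golden_hashes, test_vec)
print(faulty_block)  # block index where the fault was localized
```

The golden hashes are tiny compared to the weights themselves, which is what keeps the online check cheap; only the repair step needs access to a pristine copy of the faulty block.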