Long Exposure: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity

📅 2025-10-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Parameter-efficient fine-tuning (PEFT) of large language models (LLMs) suffers from computational inefficiency due to “shadowy sparsity”: a dynamic, fine-grained sparsity pattern that emerges during fine-tuning and is not exploited by existing acceleration methods. Method: This paper introduces Long Exposure, the first system explicitly designed to address shadowy sparsity. It combines a shadowy-sparsity exposer with a prolonged sensing range, a sequence-oriented sparse predictor, and dynamic-aware operators with coalesced memory access. Contribution/Results: Long Exposure achieves up to a 2.49× end-to-end speedup in PEFT over state-of-the-art methods while preserving fine-tuning accuracy, significantly reducing the computational cost of LLM adaptation.
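The paper itself is not reproduced on this page, so the following is only a minimal PyTorch sketch of the phenomenon the summary describes: activation sparsity during PEFT that is fine-grained (it varies token by token and neuron by neuron) and dynamic (it drifts as adapter weights change). All names here (HIDDEN, FFN, active_mask, the LoRA-style adapter) are illustrative assumptions, not the paper's code.

```python
import torch

torch.manual_seed(0)
HIDDEN, FFN, BATCH, SEQ = 64, 256, 4, 32

up = torch.nn.Linear(HIDDEN, FFN)                  # stands in for a frozen pre-trained projection
lora_a = torch.nn.Linear(HIDDEN, 8, bias=False)    # trainable low-rank adapter (LoRA-style)
lora_b = torch.nn.Linear(8, FFN, bias=False)
torch.nn.init.zeros_(lora_b.weight)                # LoRA convention: adapter starts as a no-op

def active_mask(x):
    """Boolean mask of FFN neurons that survive the ReLU, per token."""
    h = torch.relu(up(x) + lora_b(lora_a(x)))
    return h > 0

for step in range(3):
    x = torch.randn(BATCH, SEQ, HIDDEN)            # stand-in for token hidden states
    mask = active_mask(x)
    density = mask.float().mean(dim=-1)            # fraction of active neurons per token
    print(f"step {step}: mean density {density.mean():.2f}, "
          f"token-to-token std {density.std():.2f}")
    # The pattern is dynamic: as the adapter is updated (here, a dummy update),
    # the mask captured at one step goes stale at the next.
    with torch.no_grad():
        lora_b.weight.add_(0.01 * torch.randn_like(lora_b.weight))
```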

📝 Abstract
The adaptation of pre-trained large language models (LLMs) to diverse downstream tasks via fine-tuning is critical for numerous applications. However, the inefficiency of parameter-efficient fine-tuning (PEFT) techniques presents significant challenges in terms of time investments and operational costs. In this paper, we first introduce a nuanced form of sparsity, termed Shadowy Sparsity, which is distinctive in fine-tuning and has not been adequately addressed for acceleration. Under Shadowy Sparsity, we propose Long Exposure, an efficient system to accelerate PEFT for LLMs. Long Exposure comprises three key components: Shadowy-sparsity Exposer employs a prolonged sensing range to capture more sparsity details under shadowy sparsity; Sequence-oriented Predictor provides efficient yet accurate predictions to handle large sequence inputs and constantly-evolving parameters; and Dynamic-aware Operator facilitates more structured computational patterns and coalesced memory accesses, addressing dynamic sparse operations. Extensive evaluations show that Long Exposure outperforms state-of-the-arts with up to a 2.49× speedup in end-to-end fine-tuning, offering promising advancements in accelerating PEFT for LLMs.
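As one way to picture the Sequence-oriented Predictor mentioned in the abstract, here is a hedged sketch of a lightweight low-rank classifier that guesses, per token, which FFN neurons will be active, so that the heavy projections could later be restricted to that subset. The class name NeuronPredictor, the rank, and the threshold are assumptions for illustration; the paper's actual predictor design is not specified on this page.

```python
import torch

HIDDEN, FFN, RANK = 64, 256, 16

class NeuronPredictor(torch.nn.Module):
    """Tiny low-rank classifier that predicts, per token, which FFN neurons fire."""

    def __init__(self):
        super().__init__()
        self.down = torch.nn.Linear(HIDDEN, RANK, bias=False)  # cheap bottleneck
        self.up = torch.nn.Linear(RANK, FFN, bias=False)       # per-neuron logits

    def forward(self, x, threshold=0.5):
        logits = self.up(self.down(x))                  # (..., FFN)
        return torch.sigmoid(logits) > threshold        # predicted activity mask

predictor = NeuronPredictor()
x = torch.randn(2, 32, HIDDEN)                          # (batch, seq, hidden) token states
mask = predictor(x)                                     # which neurons to keep, per token
print("predicted active fraction:", mask.float().mean().item())
```

In practice such a predictor would be trained against the true activation masks observed during fine-tuning, and its cost must stay far below the FFN computation it helps to skip.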
Problem

Research questions and friction points this paper is trying to address.

Accelerating parameter-efficient fine-tuning for large language models
Addressing shadowy sparsity inefficiencies in fine-tuning processes
Optimizing computational patterns for dynamic sparse operations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Employs a prolonged sensing range to expose sparsity details hidden under shadowy sparsity
Uses a sequence-oriented predictor for efficient, accurate predictions over long sequences and evolving parameters
Implements dynamic-aware operators for structured sparse computation with coalesced memory accesses
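To make the last point concrete, below is a small, assumption-laden sketch of the idea behind a dynamic-aware sparse operator: gather only the rows and columns of the frozen FFN weights that a predictor marked as active, so the remaining work is a compact dense matmul with contiguous (coalescing-friendly) memory access rather than scattered sparse lookups. The tensors w_up, w_down, and active_idx are illustrative, not the paper's kernels.

```python
import torch

torch.manual_seed(0)
HIDDEN, FFN = 64, 256
w_up = torch.randn(FFN, HIDDEN)        # frozen up-projection weights
w_down = torch.randn(HIDDEN, FFN)      # frozen down-projection weights
x = torch.randn(8, HIDDEN)             # a tile of token hidden states

# Pretend a predictor marked roughly 25% of the FFN neurons as active for this tile.
active_idx = torch.nonzero(torch.rand(FFN) < 0.25).squeeze(-1)

# Dense reference path: computes every neuron, including the inactive ones.
dense = torch.relu(x @ w_up.t()) @ w_down.t()

# Gathered path: pack only the active rows/columns into contiguous buffers,
# so the two matmuls stay small and the memory accesses stay contiguous.
w_up_act = w_up.index_select(0, active_idx)        # (n_active, HIDDEN)
w_down_act = w_down.index_select(1, active_idx)    # (HIDDEN, n_active)
sparse = torch.relu(x @ w_up_act.t()) @ w_down_act.t()

# The two paths agree only when every skipped neuron truly outputs zero after
# the ReLU; a good predictor makes that hold for (almost) all skipped neurons.
print("active neurons:", active_idx.numel(), "of", FFN)
print("max deviation from dense:", (dense - sparse).abs().max().item())
```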
🔎 Similar Papers
No similar papers found.
Authors
Tuowei Wang
Microsoft Research, Beijing, China
Kun Li
Microsoft Research, Beijing, China
Zixu Hao
Microsoft Research, Beijing, China
Donglin Bai
Microsoft Research, Beijing, China
Ju Ren
Department of Computer Science and Technology, Tsinghua University
Internet of Things, Edge Computing/Intelligence, Security and Privacy
Yaoxue Zhang
Tsinghua University, Beijing, China
Ting Cao
Microsoft Research, Beijing, China
Mao Yang
Microsoft Research, Beijing, China