NaRA: Noise-Aware LoRA for Parameter-Efficient Fine-Tuning of Diffusion LLMs

📅 2026-05-28
🤖 AI Summary
This work addresses a key limitation of existing parameter-efficient fine-tuning methods such as LoRA: their parameters are static, so they disregard the changing noise level in diffusion-based large language models and adapt poorly to the input distributions and generation difficulty that shift during denoising. To overcome this, the authors propose Noise-aware Low-Rank Adaptation (NaRA), the first approach to integrate a noise-aware mechanism into the LoRA framework. NaRA employs a lightweight hypernetwork that generates a low-rank core matrix conditioned on the current noise level, so the update matrices vary continuously along the diffusion trajectory. With negligible parameter and latency overhead, NaRA consistently improves over noise-agnostic baselines on commonsense reasoning, mathematical reasoning, and code generation benchmarks.
📝 Abstract
Diffusion Large Language Models (dLLMs) have emerged as a promising non-autoregressive generative paradigm. Given the prohibitive computational cost of full fine-tuning, Parameter-Efficient Fine-Tuning (PEFT) has become the standard approach. However, existing PEFT methods (e.g., LoRA), originally tailored for autoregressive models, rely on static parameters that are agnostic to the noise level. Consequently, they ignore the intrinsic dynamics of the diffusion process, where input distributions and generation difficulty shift significantly along the denoising trajectory, rendering them suboptimal for dLLMs. To address this, we propose Noise-aware Low-Rank Adaptation (NaRA), which introduces a low-rank core matrix generated by a lightweight, globally shared hypernetwork conditioned on the noise level. This design enables the update matrices to vary continuously along the diffusion process while keeping parameter and latency overhead negligible. We provide a theoretical justification for the proposed NaRA framework and empirically demonstrate consistent improvements over noise-agnostic baselines across commonsense reasoning, mathematical reasoning, and code generation benchmarks. Our code is available at https://github.com/generaldi/NaRA.
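The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes the update takes the form ΔW(t) = A · C(t) · B, where A and B are ordinary per-layer low-rank factors and C(t) is an r × r core produced by one hypernetwork shared across layers. The sinusoidal noise embedding, the two-layer MLP, the identity offset on the core, and all class and function names are hypothetical choices made only for illustration. Plain Python lists are used so the example runs without third-party libraries.

```python
import math
import random


def matmul(X, Y):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian-initialised matrix."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def noise_embedding(t, dim=8):
    """Sinusoidal features of a noise level t in [0, 1] (assumed encoding)."""
    freqs = [2.0 ** k for k in range(dim // 2)]
    return ([math.sin(math.pi * f * t) for f in freqs]
            + [math.cos(math.pi * f * t) for f in freqs])


class CoreHypernetwork:
    """Tiny MLP mapping a noise level to an r x r core matrix.

    A single instance is meant to be shared by every adapted layer.
    """

    def __init__(self, rank, embed_dim=8, hidden=16, seed=0):
        rng = random.Random(seed)
        self.rank = rank
        self.embed_dim = embed_dim
        self.W1 = rand_matrix(embed_dim, hidden, rng)
        self.W2 = rand_matrix(hidden, rank * rank, rng)

    def __call__(self, t):
        e = [noise_embedding(t, self.embed_dim)]             # 1 x embed_dim
        h = [[math.tanh(v) for v in matmul(e, self.W1)[0]]]  # 1 x hidden
        flat = matmul(h, self.W2)[0]                         # rank * rank values
        r = self.rank
        # Identity offset (assumption): the core stays close to I, so the
        # layer behaves like plain LoRA plus a noise-dependent correction.
        return [[flat[i * r + j] + (1.0 if i == j else 0.0) for j in range(r)]
                for i in range(r)]


class NoiseAwareLoRALinear:
    """Frozen linear layer plus a noise-conditioned low-rank update."""

    def __init__(self, d_in, d_out, rank, hyper, seed=1):
        rng = random.Random(seed)
        self.W0 = rand_matrix(d_in, d_out, rng)              # frozen base weight
        self.A = rand_matrix(d_in, rank, rng)                # trainable factor
        self.B = [[0.0] * d_out for _ in range(rank)]        # zero init: no update at start
        self.hyper = hyper                                   # shared hypernetwork

    def delta(self, t):
        """Weight update A @ C(t) @ B for noise level t."""
        return matmul(matmul(self.A, self.hyper(t)), self.B)

    def forward(self, x, t):
        """x is a batch of row vectors (batch x d_in)."""
        base = matmul(x, self.W0)
        update = matmul(x, self.delta(t))
        return [[b + u for b, u in zip(rb, ru)] for rb, ru in zip(base, update)]


if __name__ == "__main__":
    hyper = CoreHypernetwork(rank=2)
    layer = NoiseAwareLoRALinear(d_in=4, d_out=3, rank=2, hyper=hyper)
    x = [[1.0, 0.5, -0.5, 2.0]]
    for t in (0.1, 0.5, 0.9):
        print(t, layer.forward(x, t))
```

In this sketch the only noise-dependent quantity is the small r × r core, so the extra cost per forward pass is one pass through the shared MLP, and the extra parameters (W1 and W2) do not grow with the number of adapted layers. A static LoRA layer corresponds to fixing the core to the identity for every t.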
Problem

Research questions and friction points this paper is trying to address.

Diffusion LLMs
Parameter-Efficient Fine-Tuning
noise-aware
LoRA
non-autoregressive generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Noise-Aware
Parameter-Efficient Fine-Tuning
Diffusion LLMs
Low-Rank Adaptation
Hypernetwork
Shuaidi Wang
Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China
Zhan Zhuang
Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China; Department of Computer Science, City University of Hong Kong, Hong Kong, China
Ruping Huang
Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China; Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China
Yu Zhang
Associate Professor, Southern University of Science and Technology
Artificial Intelligence, Machine Learning