NaRA: Noise-Aware LoRA for Parameter-Efficient Fine-Tuning of Diffusion LLMs

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a key limitation of existing parameter-efficient fine-tuning methods such as LoRA: they ignore how the noise level varies in diffusion-based large language models, and so adapt poorly to the input distributions and generation difficulty that shift during denoising. To overcome this, the authors propose Noise-aware Low-Rank Adaptation (NaRA), which they present as the first approach to integrate a noise-aware mechanism into the LoRA framework. NaRA employs a lightweight hypernetwork that generates low-rank core matrices conditioned on the current noise level, so the tuning parameters adapt continuously along the diffusion trajectory. While adding negligible parameter and latency overhead, NaRA improves consistently over noise-agnostic baselines on commonsense reasoning, mathematical reasoning, and code generation benchmarks.
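
One way to read the contrast between static LoRA and a noise-conditioned core is sketched below. The first line is the standard LoRA update. The second line is an assumed factorization that is consistent with the phrase "low-rank core matrix" but is not confirmed by this summary. The symbols $t$ (noise level), $C_\phi$ (core produced by a hypernetwork with parameters $\phi$), and the placement of the core between $B$ and $A$ are illustrative assumptions.

```latex
% Static LoRA: the update is the same at every noise level.
W = W_0 + B A, \qquad B \in \mathbb{R}^{d_{\text{out}} \times r},\; A \in \mathbb{R}^{r \times d_{\text{in}}}

% Assumed noise-aware form: an r x r core depends on the noise level t.
W(t) = W_0 + B \, C_\phi(t) \, A, \qquad C_\phi(t) \in \mathbb{R}^{r \times r}
```

Under this reading, setting $C_\phi(t) = I$ for all $t$ recovers plain LoRA, and because the core has only $r^2$ entries, generating it adds little on top of the $r(d_{\text{in}} + d_{\text{out}})$ parameters of $A$ and $B$.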

📝 Abstract

Diffusion Large Language Models (dLLMs) have emerged as a promising non-autoregressive generative paradigm. Given the prohibitive computational cost of full fine-tuning, Parameter-Efficient Fine-Tuning (PEFT) has become the standard approach. However, existing PEFT methods (e.g., LoRA), originally tailored for autoregressive models, rely on static parameters that are agnostic to the noise level. Consequently, they ignore the intrinsic dynamics of the diffusion process, where input distributions and generation difficulty shift significantly along the denoising trajectory, rendering them suboptimal for dLLMs. To address this, we propose Noise-aware Low-Rank Adaptation (NaRA), which introduces a low-rank core matrix generated by a lightweight, globally shared hypernetwork conditioned on the noise level. This design enables the update matrices to vary continuously along the diffusion process while keeping parameter and latency overhead negligible. We provide a theoretical justification for the proposed NaRA framework and empirically demonstrate consistent improvements over noise-agnostic baselines across commonsense reasoning, mathematical reasoning, and code generation benchmarks. Our code is available at https://github.com/generaldi/NaRA.
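
The idea of a shared hypernetwork producing a noise-conditioned core can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (see the linked repository for that). It assumes an update of the form B · C(t) · A, a sinusoidal embedding of the noise level, and a single linear readout as the hypernetwork. The names `CoreHypernetwork` and `NoiseAwareLoRALinear` are hypothetical, and the random initialisation of B is chosen only so the effect is visible (standard LoRA initialises B to zero).

```python
import math
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian matrix, used here only for illustration."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class CoreHypernetwork:
    """Hypothetical shared hypernetwork: noise level t in [0, 1] -> r x r core."""

    def __init__(self, rank, hidden=8, seed=0):
        rng = random.Random(seed)
        self.rank = rank
        self.freqs = [2.0 ** k for k in range(hidden // 2)]
        self.readout = rand_matrix(rank * rank, hidden, rng)

    def embed(self, t):
        # Sinusoidal features of the noise level (an assumed design choice).
        return [f(math.pi * w * t) for w in self.freqs for f in (math.sin, math.cos)]

    def __call__(self, t):
        e = self.embed(t)
        flat = [sum(w * x for w, x in zip(row, e)) for row in self.readout]
        r = self.rank
        # Identity offset (assumption): a zero readout would reduce to plain LoRA.
        return [[flat[i * r + j] + (1.0 if i == j else 0.0) for j in range(r)]
                for i in range(r)]


class NoiseAwareLoRALinear:
    """Frozen weight W0 plus a noise-dependent low-rank update B @ C(t) @ A."""

    def __init__(self, W0, rank, hyper, seed=1):
        rng = random.Random(seed)
        d_out, d_in = len(W0), len(W0[0])
        self.W0 = W0                            # frozen base weight, d_out x d_in
        self.A = rand_matrix(rank, d_in, rng)   # down-projection, r x d_in
        self.B = rand_matrix(d_out, rank, rng)  # up-projection, d_out x r
        self.hyper = hyper                      # shared across layers

    def delta(self, t):
        """Low-rank update for noise level t."""
        return matmul(matmul(self.B, self.hyper(t)), self.A)

    def forward(self, x, t):
        """Apply (W0 + delta(t)) to a vector x given as a list of floats."""
        dW = self.delta(t)
        return [sum((w + d) * xi for w, d, xi in zip(w_row, d_row, x))
                for w_row, d_row in zip(self.W0, dW)]


# Usage: one hypernetwork shared by two adapted layers.
hyper = CoreHypernetwork(rank=2)
layer1 = NoiseAwareLoRALinear([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], rank=2, hyper=hyper, seed=1)
layer2 = NoiseAwareLoRALinear([[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]], rank=2, hyper=hyper, seed=2)
y_noisy = layer1.forward([1.0, 2.0, 3.0], t=0.9)  # early, high-noise step
y_clean = layer1.forward([1.0, 2.0, 3.0], t=0.1)  # late, low-noise step
```

Because the core is only r × r and the hypernetwork is shared by all adapted layers in this sketch, the per-step cost is one small forward pass through the hypernetwork plus one extra r × r product per layer, which is in line with the abstract's claim of negligible overhead.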

Problem

Research questions and friction points this paper is trying to address.

Diffusion LLMs

Parameter-Efficient Fine-Tuning

noise-aware

LoRA

non-autoregressive generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Noise-Aware

Parameter-Efficient Fine-Tuning

Diffusion LLMs