AR-MAP: Are Autoregressive Large Language Models Implicit Teachers for Diffusion Large Language Models?

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion-based large language models (DLLMs) struggle with efficient preference alignment due to the high variance induced by ELBO-based likelihood estimation. This work proposes AR-MAP, a novel framework that, for the first time, demonstrates how autoregressive and diffusion language models can share an architecture to enable knowledge transfer. By leveraging a pre-aligned autoregressive model as an implicit teacher, AR-MAP transfers alignment knowledge to the DLLM through lightweight weight scaling, circumventing the high variance and computational overhead of direct alignment. The method requires no complex training procedures and achieves an average score of 69.08% across multiple preference alignment benchmarks, matching or surpassing existing DLLM-specific alignment approaches.

📝 Abstract
Diffusion Large Language Models (DLLMs) have emerged as a powerful alternative to autoregressive models, enabling parallel token generation across multiple positions. However, preference alignment of DLLMs remains challenging due to the high variance introduced by Evidence Lower Bound (ELBO)-based likelihood estimation. In this work, we propose AR-MAP, a novel transfer learning framework that leverages preference-aligned autoregressive LLMs (AR-LLMs) as implicit teachers for DLLM alignment. We reveal that DLLMs can effectively absorb alignment knowledge from AR-LLMs through simple weight scaling, exploiting the shared architectural structure between these divergent generation paradigms. Crucially, our approach circumvents the high variance and computational overhead of direct DLLM alignment. Comprehensive experiments across diverse preference alignment tasks demonstrate that AR-MAP achieves competitive or superior performance compared to existing DLLM-specific alignment methods, reaching a 69.08% average score across all tasks and models. Our code is available at https://github.com/AMAP-ML/AR-MAP.
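The abstract describes transferring alignment knowledge through "simple weight scaling" between architecturally matched AR and diffusion models, but does not spell out the update rule here. The sketch below is a minimal illustration of one plausible reading, in the spirit of task-vector arithmetic: compute the alignment delta between an aligned and a base AR teacher, scale it, and add it to the shared DLLM parameters. The function name, the `alpha` coefficient, and the `base + alpha * delta` rule are all assumptions for illustration, not the paper's exact method.

```python
# Hypothetical sketch of AR-to-DLLM alignment transfer via weight scaling.
# Assumes the AR teacher and the DLLM share parameter names and shapes;
# scalars stand in for weight tensors to keep the example self-contained.

def transfer_alignment(dllm_weights, ar_base, ar_aligned, alpha=0.5):
    """Shift each shared DLLM parameter by a scaled alignment delta
    computed from the AR teacher (aligned minus base)."""
    out = {}
    for name, w in dllm_weights.items():
        if name in ar_base and name in ar_aligned:
            delta = ar_aligned[name] - ar_base[name]  # alignment direction
            out[name] = w + alpha * delta             # lightweight scaling
        else:
            out[name] = w  # parameters not shared are left untouched
    return out

# Toy example: "attn.q" is nudged toward the aligned teacher,
# "mlp.up" is unchanged because the teacher's delta there is zero.
dllm = {"attn.q": 1.0, "mlp.up": 2.0}
base = {"attn.q": 1.0, "mlp.up": 1.5}
aligned = {"attn.q": 1.4, "mlp.up": 1.5}
merged = transfer_alignment(dllm, base, aligned, alpha=0.5)
print(merged)
```

Because the update is a closed-form weight edit rather than a training loop, it avoids the ELBO-based likelihood estimation (and its variance) entirely, which matches the abstract's claim of no complex training procedure.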
Problem

Research questions and friction points this paper addresses.

Diffusion Large Language Models
preference alignment
Evidence Lower Bound
high variance
autoregressive LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion Large Language Models
Preference Alignment
Transfer Learning
Autoregressive Models
Weight Scaling