LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

📅 2025-10-06
🤖 AI Summary
Current large language models (LLMs) rely on autoregressive decoding for chain-of-thought (CoT) reasoning, which makes it hard to revisit and revise early reasoning steps as a whole and to explore diverse solution paths efficiently. To address this, we propose LaDiR, a latent-space iterative reasoning framework built on continuous representations. First, a variational autoencoder (VAE) encodes reasoning steps into structured latent blocks. Second, a latent diffusion model with a blockwise bidirectional attention mask denoises each block in parallel and can generate multiple reasoning paths, which supports holistic planning of the reasoning process and adaptive refinement at test time. Experiments on mathematical reasoning and planning benchmarks show that LaDiR consistently improves accuracy, reasoning diversity, and interpretability over existing autoregressive, diffusion-based, and latent reasoning methods.

📝 Abstract
Large Language Models (LLMs) demonstrate their reasoning ability through chain-of-thought (CoT) generation. However, LLMs' autoregressive decoding may limit their ability to revisit and refine earlier tokens in a holistic manner, which can also lead to inefficient exploration of diverse solutions. In this paper, we propose LaDiR (Latent Diffusion Reasoner), a novel reasoning framework that unifies the expressiveness of continuous latent representations with the iterative refinement capabilities of latent diffusion models for an existing LLM. We first construct a structured latent reasoning space using a Variational Autoencoder (VAE) that encodes text reasoning steps into blocks of thought tokens, preserving semantic information and interpretability while offering compact but expressive representations. We then use a latent diffusion model that learns to denoise a block of latent thought tokens with a blockwise bidirectional attention mask, enabling longer-horizon reasoning and iterative refinement with adaptive test-time compute. This design supports efficient parallel generation of diverse reasoning trajectories, so the model can plan and revise the reasoning process holistically. We evaluate LaDiR on a suite of mathematical reasoning and planning benchmarks. Empirical results show that LaDiR consistently improves accuracy, diversity, and interpretability over existing autoregressive, diffusion-based, and latent reasoning methods, revealing a new paradigm for text reasoning with latent diffusion.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLMs' holistic reasoning refinement capabilities
Improving efficiency of diverse solution exploration in text
Unifying latent representation with iterative diffusion refinement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent diffusion enables iterative refinement of reasoning
VAE encodes reasoning steps into compact latent tokens
Blockwise bidirectional attention allows holistic reasoning revision
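
To make the blockwise bidirectional attention idea concrete, here is a minimal sketch of one way such a mask could be built. It is not taken from the paper: the function name `blockwise_bidirectional_mask` is hypothetical, and the choice that tokens attend freely within their own block while seeing only earlier blocks (never later ones) is an assumption about how "blockwise bidirectional" might be realized.

```python
def blockwise_bidirectional_mask(num_blocks: int, block_size: int) -> list[list[bool]]:
    """Build a (T x T) attention mask with T = num_blocks * block_size.

    mask[q][k] is True when query position q may attend to key position k.
    Assumption (not confirmed by the source): attention is bidirectional
    inside a block and causal across blocks.
    """
    # Block index of every token position, e.g. [0, 0, 1, 1] for 2 blocks of 2.
    block_ids = [b for b in range(num_blocks) for _ in range(block_size)]
    # A query sees a key if the key's block is the same or an earlier one.
    return [
        [key_block <= query_block for key_block in block_ids]
        for query_block in block_ids
    ]


mask = blockwise_bidirectional_mask(num_blocks=2, block_size=2)
for row in mask:
    print(" ".join("1" if allowed else "0" for allowed in row))
# prints:
# 1 1 0 0
# 1 1 0 0
# 1 1 1 1
# 1 1 1 1
```

In the printed mask, the first two rows (block 0) see each other in both directions but not block 1, while the last two rows (block 1) see every position. A fully causal mask would instead be lower triangular, so the first row would read `1 0 0 0`.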