SeLaR: Selective Latent Reasoning in Large Language Models

πŸ“… 2026-04-09
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses key limitations in existing chain-of-thought (CoT) and latent reasoning approaches: discrete token representations constrain expressivity, while global soft embeddings in latent methods often induce high-confidence step perturbations and embedding collapse, undermining stability and exploration. To overcome these issues, the authors propose SeLaRβ€”a lightweight, training-free hybrid reasoning framework that employs an entropy-gated mechanism to activate soft embeddings only during low-confidence reasoning steps, preserving discrete decoding for high-confidence ones. Additionally, SeLaR incorporates entropy-aware contrastive regularization to sustain multi-path exploration. This design effectively avoids disrupting reliable inference steps and prevents soft embedding collapse, yielding significant performance gains over standard CoT and state-of-the-art training-free baselines across five reasoning benchmarks.
πŸ“ Abstract
Chain-of-Thought (CoT) has become a cornerstone of reasoning in large language models, yet its effectiveness is constrained by the limited expressiveness of discrete token sampling. Recent latent reasoning approaches attempt to alleviate this limitation by replacing discrete tokens with soft embeddings (probability-weighted mixtures of token embeddings) or hidden states, but they commonly suffer from two issues: (1) global activation injects perturbations into high-confidence steps, impairing reasoning stability; and (2) soft embeddings quickly collapse toward the highest-probability token, limiting exploration of alternative trajectories. To address these challenges, we propose SeLaR (Selective Latent Reasoning), a lightweight and training-free framework. SeLaR introduces an entropy-gated mechanism that activates soft embeddings only at low-confidence steps, while preserving discrete decoding at high-confidence steps. Additionally, we propose an entropy-aware contrastive regularization that pushes soft embeddings away from the dominant (highest-probability) token's direction, encouraging sustained exploration of multiple latent reasoning paths. Experiments on five reasoning benchmarks demonstrate that SeLaR consistently outperforms standard CoT and state-of-the-art training-free methods.
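The entropy-gated mechanism described above can be sketched in a few lines: compute the entropy of the model's next-token distribution, keep discrete decoding when confidence is high, and feed a probability-weighted mixture of token embeddings otherwise. This is an illustrative reconstruction, not the paper's code; the function names (`entropy`, `next_input_embedding`) and the threshold `tau` are assumptions for the sketch.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a next-token distribution."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def next_input_embedding(probs, embedding_table, tau=1.0):
    """Entropy-gated choice between discrete and soft (latent) input.

    probs: (V,) next-token distribution from the model
    embedding_table: (V, d) token embedding matrix
    tau: hypothetical entropy threshold separating high- from
         low-confidence steps (not a value given in the paper)
    """
    if entropy(probs) <= tau:
        # High confidence: preserve standard discrete decoding.
        token = int(np.argmax(probs))
        return embedding_table[token]
    # Low confidence: soft embedding = probability-weighted
    # mixture of all token embeddings.
    return probs @ embedding_table
```

A peaked distribution falls below the gate and yields an ordinary token embedding; a flat, high-entropy distribution yields the soft mixture.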
Problem

Research questions and friction points this paper is trying to address.

Chain-of-Thought · latent reasoning · soft embeddings · reasoning stability · token collapse
Innovation

Methods, ideas, or system contributions that make the work stand out.

Selective Latent Reasoning · Chain-of-Thought · Soft Embeddings · Entropy-Gated Mechanism · Contrastive Regularization
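The contrastive regularization listed above pushes soft embeddings away from the dominant (highest-probability) token's direction so the mixture does not collapse onto it. A minimal sketch of that idea, assuming a simple projection-based form (the function `push_from_dominant` and the `strength` coefficient are hypothetical; the paper's actual regularizer is entropy-aware and may differ):

```python
import numpy as np

def push_from_dominant(soft_emb, dominant_emb, strength=0.5):
    """Push a soft embedding away from the dominant token's direction.

    Removes a fraction (`strength`) of the soft embedding's component
    along the unit direction of the dominant token's embedding,
    discouraging collapse onto the highest-probability token.
    """
    d = dominant_emb / (np.linalg.norm(dominant_emb) + 1e-12)
    return soft_emb - strength * np.dot(soft_emb, d) * d
```

With `strength=1.0` the component along the dominant direction is removed entirely, leaving only the part of the mixture that explores alternative trajectories.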
Renyu Fu
Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University
Guibo Luo
Peking University
medical imaging · privacy computing