LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models

📅 2025-08-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current large language models (LLMs) often degenerate into greedy decoding over the most salient components of soft inputs when performing “soft reasoning” in continuous concept spaces, thereby suppressing reasoning-path diversity and undermining the expressive potential of soft tokens. This work identifies this mechanistic limitation and proposes injecting controllable stochasticity to unlock soft reasoning capabilities: we explicitly model uncertainty during generation by integrating Dirichlet resampling with the Gumbel-Softmax reparameterization. Using probing techniques to analyze internal representations, we validate the approach across eight reasoning benchmarks. Results demonstrate that Gumbel-Softmax significantly enhances soft reasoning performance, shifting LLMs from deterministic decoding toward more exploratory, continuous-space inference. Our method establishes a novel paradigm for soft abstraction modeling—grounded in principled uncertainty quantification—while preserving architectural compatibility and computational tractability.
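The Gumbel-Softmax reparameterization mentioned above can be sketched in a few lines: adding i.i.d. Gumbel(0, 1) noise to the logits and applying a temperature-scaled softmax yields a random point on the probability simplex, which can then weight the token embeddings into a "soft token". This is a minimal NumPy illustration of the general trick, not the paper's implementation; the temperature `tau`, the toy vocabulary size, and the embedding dimensions are assumptions for the example.

```python
import numpy as np

def gumbel_softmax(logits, tau=0.5, rng=None):
    """Sample a soft one-hot vector via the Gumbel-Softmax trick.

    As tau -> 0 the sample concentrates on one vertex of the simplex
    (approaching a hard, greedy choice); larger tau keeps it smoother.
    """
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
    u = rng.uniform(1e-12, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    z = (logits + g) / tau
    z -= z.max()                      # numerical stability before exp
    p = np.exp(z)
    return p / p.sum()

# A soft token is the probability-weighted mix of token embeddings
# (toy sizes for illustration):
vocab_size, d_model = 6, 4
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(vocab_size, d_model))
logits = np.array([2.0, 1.0, 0.5, 0.0, -1.0, -2.0])
weights = gumbel_softmax(logits, tau=0.5, rng=rng)
soft_token = weights @ embeddings     # shape: (d_model,)
```

Because the noise is injected before the softmax, repeated draws explore different mixtures while the temperature keeps each mixture smooth, which is the "adequate randomness with controlled smoothness" property the summary attributes to this trick.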

📝 Abstract
Human cognition naturally engages with abstract and fluid concepts, whereas existing reasoning models often rely on generating discrete tokens, potentially constraining their expressive capabilities. Recent advancements aim to address this limitation by enabling large language models (LLMs) to generate soft, abstract tokens, thus facilitating reasoning within a continuous concept space. This paper explores the "Soft Thinking" capabilities of various LLMs by examining the models' internal behavior using a suite of probing techniques. Contrary to the common belief that Soft Thinking enables the simultaneous exploration of diverse reasoning paths, our findings reveal that LLMs predominantly rely on the most influential component of the soft inputs during subsequent decoding steps. This reliance hinders the exploration of different reasoning paths and reduces vanilla Soft Thinking to a form of greedy decoding, obscuring the advantage of transmitting more information through Soft Tokens. To tackle this issue, we explore sampling strategies to introduce *randomness*, employing methods such as Dirichlet resampling and the Gumbel-Softmax trick. Our experiments demonstrate that incorporating randomness can alleviate the limitations of vanilla approaches and unleash the potential of Soft Thinking. Notably, the Gumbel-Softmax trick provides adequate randomness with controlled smoothness, resulting in superior performance across eight reasoning benchmarks.
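Dirichlet resampling, the other strategy the abstract names, can be sketched similarly: instead of perturbing logits, it redraws a point on the simplex around the model's own output distribution. The scaling of the concentration parameters below is an illustrative assumption, not the paper's exact parameterization.

```python
import numpy as np

def dirichlet_resample(probs, alpha_scale=50.0, rng=None):
    """Resample a point on the simplex centered on `probs`.

    Drawing w ~ Dirichlet(alpha_scale * probs) keeps the mean at `probs`
    while alpha_scale controls the spread: large values stay close to the
    original distribution, small values inject more randomness.
    """
    rng = rng or np.random.default_rng()
    alpha = np.maximum(alpha_scale * probs, 1e-6)  # Dirichlet needs alpha > 0
    return rng.dirichlet(alpha)

# Example: resample around a peaked next-token distribution
probs = np.array([0.7, 0.2, 0.1])
w = dirichlet_resample(probs, alpha_scale=50.0, rng=np.random.default_rng(1))
```

The resampled weights `w` would then mix token embeddings into a soft token in the same way as any other point on the simplex, so the mechanism slots into soft-token generation without architectural changes.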
Problem

Research questions and friction points this paper is trying to address.

LLMs lack fluid abstract reasoning like human cognition
Soft Thinking in LLMs reduces to greedy decoding
Whether injecting randomness can restore Soft Thinking's exploratory benefits in reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generating soft abstract tokens for reasoning
Using probing techniques to analyze model behavior
Employing Gumbel-Softmax for controlled randomness
Chunhung Wu
Baidu Inc., Beijing, China
Jinliang Lu
Baidu Inc., Beijing, China
Zixuan Ren
Baidu Inc., Beijing, China
Gangqiang Hu
Baidu Inc., Beijing, China
Zhi Wu
Baidu Inc., Beijing, China
Dai Dai
Baidu
Hua Wu
Baidu Inc., Beijing, China