From Noise to Diversity: Random Embedding Injection in LLM Reasoning

📅 2026-05-12
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work asks whether the reasoning gains from soft prompts in large language models stem from the learned content or merely from the act of injecting extra vectors into the input. To separate the two, the authors study Random Soft Prompts (RSPs), a training-free mechanism that appends isotropic Gaussian embeddings, freshly resampled for each inference, to the input. Experiments show that RSPs raise early-token diversity and, combined with temperature sampling, improve Pass@N, reaching accuracy comparable to optimized soft prompts on math reasoning benchmarks in several settings. The analysis attributes this to a structural effect of injection itself: the distribution over the first few generated tokens flattens and reasoning trajectories branch, and the influence then dilutes as generation continues so that the response commits to a single completion. The same effect is carried into DAPO training, where the authors report practical gains.
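The sampling step can be illustrated with a minimal sketch. It assumes that "fitted to the entrywise mean and variance of the pretrained embedding table" (as the abstract puts it) means a single scalar mean and a single scalar standard deviation pooled over all entries of the table, which is one reading consistent with an isotropic Gaussian. The function names `fit_entrywise_gaussian`, `sample_rsp`, and `append_rsp` are hypothetical, and plain Python lists stand in for real model tensors; this is not the authors' implementation.

```python
import random
import statistics


def fit_entrywise_gaussian(embedding_table):
    """Pool every entry of the table and return (mean, std) as two scalars.

    Assumption: one scalar mean and one scalar std shared by all dimensions.
    """
    entries = [value for row in embedding_table for value in row]
    return statistics.fmean(entries), statistics.pstdev(entries)


def sample_rsp(embedding_table, num_vectors, rng):
    """Draw num_vectors fresh random vectors with the table's embedding width."""
    mean, std = fit_entrywise_gaussian(embedding_table)
    dim = len(embedding_table[0])
    return [[rng.gauss(mean, std) for _ in range(dim)] for _ in range(num_vectors)]


def append_rsp(input_embeddings, embedding_table, num_vectors, rng):
    """Return the input embeddings followed by a newly sampled random soft prompt."""
    return list(input_embeddings) + sample_rsp(embedding_table, num_vectors, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for a pretrained embedding table: 200 tokens, width 8.
    table = [[rng.gauss(0.5, 2.0) for _ in range(8)] for _ in range(200)]
    prompt = table[:5]  # pretend these are the embedded input tokens
    first = append_rsp(prompt, table, 3, rng)
    second = append_rsp(prompt, table, 3, rng)
    print(len(first), first[5:] != second[5:])
```

Calling `append_rsp` once per generation gives each attempt a different random suffix, which is the "resampled for each inference" property the summary describes; nothing in the sketch is trained.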
📝 Abstract
Recent soft prompt research has tried to improve reasoning by inserting trained vectors into LLM inputs, yet whether the gain comes from the learned content or from the act of injection itself has not been carefully separated. We study Random Soft Prompts (RSPs), which drop the training step entirely and append a freshly drawn sequence of random embedding vectors to the input. Each RSP vector is sampled from an isotropic Gaussian fitted to the entrywise mean and variance of the pretrained embedding table; the sequence carries no learned content, and yet reaches accuracy comparable to optimized soft prompts on math reasoning benchmarks in several settings. The mechanism unfolds in two stages: because attention has to absorb a never-seen-before random position, the distribution over the first few generated tokens flattens and reasoning trajectories branch, and as generation continues this influence dilutes naturally so the response commits to a single completion. We show that during inference RSPs lift early-stage token diversity and, combined with temperature sampling, widen Pass@N, the probability that at least one out of N attempts is correct. Beyond inference, we carry the same effect into DAPO training and demonstrate practical gains. Our contributions are: (i) RSP isolates the simplest form of soft prompt -- training-free, freshly resampled -- providing a unified lens for the structural effect of injection that variants otherwise differing in training and form all share; (ii) a theoretical and empirical validation of the underlying mechanism; and (iii) an extension from inference to training.
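The Pass@N definition above (the probability that at least one out of N attempts is correct) can be made concrete with a short sketch. The paper's exact evaluation protocol is not given here, so this is only an assumed, simple empirical estimate: the fraction of problems for which at least one of the N recorded attempts was correct. The helper names `pass_at_n` and `pass_at_n_independent` are hypothetical; the second shows the closed form that holds only under the added assumption that attempts succeed independently with the same probability p.

```python
def pass_at_n(attempts_per_problem):
    """Fraction of problems with at least one correct attempt.

    attempts_per_problem: one list of booleans per problem, one boolean per attempt.
    """
    solved = sum(1 for attempts in attempts_per_problem if any(attempts))
    return solved / len(attempts_per_problem)


def pass_at_n_independent(p, n):
    """Pass@N when each of n attempts succeeds independently with probability p."""
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    results = [
        [False, True, False],   # solved on the second attempt
        [False, False, False],  # never solved
    ]
    print(pass_at_n(results))
    print(pass_at_n_independent(0.5, 2))
```

Under the independence assumption, Pass@N grows with N only to the extent that attempts actually differ, which is why greater early-token diversity across samples can widen Pass@N.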
Problem

Research questions and friction points this paper is trying to address.

soft prompts
reasoning
random embedding
large language models
injection effect
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random Soft Prompts
LLM Reasoning
Prompt Injection
Token Diversity
DAPO Training