Stochastic CHAOS: Why Deterministic Inference Kills, and Distributional Variability Is the Heartbeat of Artificial Cognition

📅 2026-01-12
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work challenges the prevailing reliance on deterministic outputs in large language model (LLM) inference, which obscures inherent uncertainties, vulnerabilities, and safety risks while suppressing emergent capabilities and multi-path reasoning. To address this, the authors propose the "Stochastic CHAOS" framework, which treats output distribution variability as a core cognitive signal. By integrating multi-path sampling, non-deterministic attention mechanisms, and multi-sample evaluation, the framework systematically uncovers the true capabilities and latent risks of LLMs. Experimental results demonstrate that deterministic inference substantially underestimates both model performance and tail risks. In contrast, the proposed approach effectively recovers emergent phenomena, enhances reasoning accuracy, and reveals hidden safety hazards, thereby questioning conventional assumptions about reliability and reproducibility in LLM deployment.
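The gap between single-sample and multi-sample evaluation can be illustrated with the standard unbiased pass@k estimator, 1 - C(n-c, k)/C(n, k), from the code-generation evaluation literature (a sketch of the general technique, not code from this paper; the example numbers are invented):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k completions,
    drawn without replacement from n samples of which c are correct,
    succeeds. Computed as 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples exist, so some draw succeeds
    return 1.0 - comb(n - c, k) / comb(n, k)

# A model that solves a task on 30 of 100 sampled completions looks very
# different under single-sample vs. multi-sample evaluation:
print(pass_at_k(100, 30, 1))   # ≈ 0.3: what one sample measures on average
print(pass_at_k(100, 30, 10))  # ≈ 0.977: capability revealed by 10 samples
```

A deterministic greedy run reports a single 0 or 1 for this task, whereas the multi-sample estimate both calibrates the capability and exposes the 70% per-sample failure probability that greedy evaluation hides.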

📝 Abstract
Deterministic inference is a comforting ideal in classical software: the same program on the same input should always produce the same output. As large language models move into real-world deployment, this ideal has been imported wholesale into inference stacks. Recent work from the Thinking Machines Lab has presented a detailed analysis of nondeterminism in LLM inference, showing how batch-invariant kernels and deterministic attention can enforce bitwise-identical outputs, positioning deterministic inference as a prerequisite for reproducibility and enterprise reliability. In this paper, we take the opposite stance. We argue that, for LLMs, deterministic inference kills. It kills the ability to model uncertainty, suppresses emergent abilities, collapses reasoning into a single brittle path, and weakens safety alignment by hiding tail risks. LLMs implement conditional distributions over outputs, not fixed functions. Collapsing these distributions to a single canonical completion may appear reassuring, but it systematically conceals properties central to artificial cognition. We instead advocate Stochastic CHAOS, treating distributional variability as a signal to be measured and controlled. Empirically, we show that deterministic inference is systematically misleading. Single-sample deterministic evaluation underestimates both capability and fragility, masking failure probability under paraphrases and noise. Phase-like transitions associated with emergent abilities disappear under greedy decoding. Multi-path reasoning degrades when forced onto deterministic backbones, reducing accuracy and diagnostic insight. Finally, deterministic evaluation underestimates safety risk by hiding rare but dangerous behaviors that appear only under multi-sample evaluation.
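The multi-path reasoning argument can be sketched with a toy self-consistency setup (illustrative only: the paths, probabilities, and answers below are invented, not taken from the paper). When the single most likely reasoning path ends in a wrong answer but most of the probability mass agrees on the right one, greedy decoding fails while sampling plus majority vote succeeds:

```python
import random
from collections import Counter

# Toy model of a reasoning task: each sampled chain of thought ends in a
# final answer. The single most likely path (prob 0.35) reaches the wrong
# answer "17", but three distinct paths (total prob 0.65) all reach "42".
PATHS = [("17", 0.35), ("42", 0.25), ("42", 0.22), ("42", 0.18)]

def greedy_answer() -> str:
    """Deterministic decoding: commit to the single most likely path."""
    return max(PATHS, key=lambda p: p[1])[0]

def self_consistency_answer(rng: random.Random, n_paths: int = 51) -> str:
    """Sample many reasoning paths and return the majority-vote answer."""
    answers, weights = zip(*PATHS)
    votes = Counter(rng.choices(answers, weights=weights)[0]
                    for _ in range(n_paths))
    return votes.most_common(1)[0][0]

rng = random.Random(0)
print(greedy_answer())               # "17": the brittle single path
print(self_consistency_answer(rng))  # almost surely "42" via majority vote
```

The point mirrors the abstract's claim: collapsing the output distribution to one canonical completion discards exactly the cross-path agreement that makes the answer recoverable.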
Problem

Research questions and friction points this paper is trying to address.

deterministic inference
distributional variability
large language models
artificial cognition
stochasticity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stochastic CHAOS
distributional variability
nondeterministic inference
emergent abilities
multi-sample evaluation