🤖 AI Summary
This work addresses the inefficiency of large reasoning models that often generate redundant reasoning steps due to "overthinking," thereby wasting computational resources. The authors propose a neuron-level early-exit mechanism that requires no additional training, external data, or test-time computation. By dynamically monitoring internal neuron activation patterns during inference, the method terminates redundant reasoning as soon as a correct answer is confidently reached. This approach achieves training-agnostic control over reasoning paths for the first time, reducing token consumption by 22%–28% on average across six diverse models of varying scales and architectures, while preserving original task accuracy on four standard reasoning benchmarks.
📄 Abstract
Large Reasoning Models (LRMs) often suffer from \emph{overthinking}, a phenomenon in which redundant reasoning steps are generated after a correct solution has already been reached. Existing early reasoning exit methods primarily rely on output-level heuristics or trained probing models to skip redundant reasoning steps, thereby mitigating overthinking. However, these approaches typically require additional rollout computation or externally labeled datasets. In this paper, we propose \textbf{NEAT}, a \textbf{N}euron-based \textbf{E}arly re\textbf{A}soning exi\textbf{T} framework that monitors neuron-level activation dynamics to enable training-free early exits, without introducing additional test-time computation. NEAT identifies exit-associated neurons and tracks their activation patterns during reasoning to dynamically trigger early exit or suppress reflection, thereby reducing unnecessary reasoning while preserving solution quality. Experiments on four reasoning benchmarks across six models of different scales and architectures show that NEAT reduces each model's token consumption by 22\% to 28\% on average over the four benchmarks, while maintaining accuracy.
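The core loop the abstract describes, monitoring exit-associated neurons during decoding and stopping once their activations signal a confident answer, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the neuron indices (`EXIT_NEURONS`), the threshold, the mean-activation exit rule, and the `step_fn` interface are all hypothetical assumptions.

```python
# Toy sketch of a neuron-level early-exit loop (NEAT-style idea).
# EXIT_NEURONS, THRESHOLD, and the exit rule are illustrative assumptions.

EXIT_NEURONS = [2, 5]   # hypothetical indices of exit-associated neurons
THRESHOLD = 0.8         # hypothetical confidence threshold

def should_exit(activations):
    """Fire when the mean activation of exit neurons crosses the threshold."""
    vals = [activations[i] for i in EXIT_NEURONS]
    return sum(vals) / len(vals) > THRESHOLD

def generate_with_early_exit(step_fn, max_steps=10):
    """Decode step by step; stop as soon as the exit signal fires.

    step_fn(t) -> (token, activations) stands in for one decoding step
    that also exposes the monitored neuron activations.
    """
    tokens = []
    for t in range(max_steps):
        token, acts = step_fn(t)
        tokens.append(token)
        if should_exit(acts):
            break  # terminate redundant reasoning early
    return tokens

# Toy decoder whose exit neurons saturate after step 3.
def toy_step(t):
    acts = [0.1] * 8
    if t >= 3:
        acts[2] = acts[5] = 0.95
    return f"tok{t}", acts

print(generate_with_early_exit(toy_step))  # → ['tok0', 'tok1', 'tok2', 'tok3']
```

Because the check reads activations the forward pass already computes, a monitor of this shape adds no extra rollouts or test-time computation, matching the training-free property claimed in the abstract.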