From Reasoning to Answer: Empirical, Attention-Based and Mechanistic Insights into Distilled DeepSeek R1 Models

📅 2025-09-28
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work investigates the causal influence of explicit reasoning traces on final answer generation in large reasoning models (LRMs). Addressing the lack of clarity about how reasoning processes mechanistically affect answers, we introduce "Reasoning-Focus Heads" (RFHs), attention heads that track reasoning trajectories. We localize and validate RFHs at intermediate layers through integrated attention analysis, activation patching interventions, and empirical evaluation. Experiments demonstrate that explicit reasoning substantially improves answer quality; moreover, targeted perturbations of key reasoning tokens attended to by RFHs consistently alter final outputs, confirming a causal dependency of answers on reasoning paths. To our knowledge, this is the first systematic characterization of directed information flow from reasoning to answers within LRMs. Our findings establish a foundation for interpretable and controllable reasoning, supporting model introspection, diagnostic intervention, and reasoning-aware architecture design.
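The attention analysis described above can be illustrated with a minimal pure-Python sketch: for a single (hypothetical) attention head and a single answer token, measure how much of the softmax-normalized attention mass lands on reasoning-token positions. This is a toy illustration of how a "reasoning-focused" head might be scored, not the paper's actual implementation; the scores and positions below are invented.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of raw attention scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def reasoning_attention_mass(scores, reasoning_positions):
    """Fraction of one head's attention (from a single answer token)
    that falls on reasoning-token positions."""
    weights = softmax(scores)
    return sum(weights[i] for i in reasoning_positions)

# Toy context of 6 positions; the first 4 are reasoning tokens.
scores = [2.0, 1.5, 0.5, 1.0, -1.0, -0.5]  # invented attention logits
mass = reasoning_attention_mass(scores, reasoning_positions=range(4))
```

A head whose `mass` stays high as the answer is generated would, under this toy criterion, behave like an RFH; the paper additionally localizes such heads by layer and validates them causally.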

๐Ÿ“ Abstract
Large Reasoning Models (LRMs) generate explicit reasoning traces alongside final answers, yet the extent to which these traces influence answer generation remains unclear. In this work, we conduct a three-stage investigation into the interplay between reasoning and answer generation in three distilled DeepSeek R1 models. First, through empirical evaluation, we demonstrate that including explicit reasoning consistently improves answer quality across diverse domains. Second, attention analysis reveals that answer tokens attend substantially to reasoning tokens, with certain mid-layer Reasoning-Focus Heads (RFHs) closely tracking the reasoning trajectory, including self-reflective cues. Third, we apply mechanistic interventions using activation patching to assess the dependence of answer tokens on reasoning activations. Our results show that perturbations to key reasoning tokens can reliably alter the final answers, confirming a directional and functional flow of information from reasoning to answer. These findings deepen our understanding of how LRMs leverage reasoning tokens for answer generation, highlighting the functional role of intermediate reasoning in shaping model outputs. Our data and code are publicly available at https://aka.ms/R2A-code.
Problem

Research questions and friction points this paper is trying to address.

Investigating how reasoning traces influence answer generation in distilled models
Analyzing attention patterns between reasoning tokens and answer tokens
Assessing functional dependence of answers on reasoning through mechanistic interventions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Empirical evaluation shows reasoning improves answer quality
Attention analysis reveals reasoning tokens guide answer generation
Mechanistic interventions confirm reasoning functionally shapes final answers
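The activation-patching intervention named above can be sketched in miniature: run a "clean" input and a "corrupted" input through a toy two-stage model, then overwrite the corrupted run's intermediate activations with the cached clean ones and observe the output. The toy model and inputs are invented for illustration; real activation patching operates on cached transformer activations, not on this two-line function.

```python
def toy_model(x, patch_hidden=None):
    """Two-stage toy network: hidden = f(x); out = g(hidden).
    If patch_hidden is given, it overwrites the hidden activations,
    which is the essence of activation patching."""
    hidden = [2 * v + 1 for v in x]   # stage 1: intermediate "reasoning" activations
    if patch_hidden is not None:
        hidden = patch_hidden          # intervene: substitute cached activations
    return sum(hidden)                 # stage 2: the "answer"

clean_x = [1.0, 2.0]
corrupt_x = [0.0, 0.0]

# Cache the clean run's intermediate activations.
clean_hidden = [2 * v + 1 for v in clean_x]   # [3.0, 5.0]

clean_out = toy_model(clean_x)                                  # 8.0
corrupt_out = toy_model(corrupt_x)                              # 2.0
patched_out = toy_model(corrupt_x, patch_hidden=clean_hidden)   # 8.0
```

Because patching the clean intermediate activations into the corrupted run restores the clean output, the output demonstrably depends on those activations; the paper applies the same logic to argue that answers causally depend on reasoning-token activations.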