Beyond Dense States: Elevating Sparse Transcoders to Active Operators for Latent Reasoning

📅 2026-02-02
📈 Citations: 0 · Influential: 0
🤖 AI Summary
Existing latent reasoning methods rely on dense state transitions that are hard to interpret and control, while sparse representation models offer interpretability but have been limited to post-hoc analysis. This work proposes LSTR (Latent Sparse Transcoder Reasoning), a framework that, for the first time, integrates a sparse transcoder as an active inference component. Through a residual skip architecture, LSTR decouples linear manifold transport from sparse semantic updates, enabling multi-step, controllable latent reasoning. Explicit sparsity constraints substantially improve interpretability without sacrificing reasoning accuracy or compression efficiency, and causal intervention experiments confirm that the learned sparse features are causally effective in the reasoning process.
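The summary describes the core mechanism only in prose; below is a minimal sketch of what one such transcoder step could look like, assuming a top-k sparsity constraint. All module and parameter names here (LatentTransitionTranscoder, d_model, n_features, k) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class LatentTransitionTranscoder(nn.Module):
    """One latent reasoning step: a residual skip (linear manifold transport)
    plus a sparse semantic update through an overcomplete feature dictionary."""

    def __init__(self, d_model: int, n_features: int, k: int):
        super().__init__()
        self.skip = nn.Linear(d_model, d_model)        # linear manifold transport
        self.encoder = nn.Linear(d_model, n_features)  # latent -> feature activations
        self.decoder = nn.Linear(n_features, d_model)  # sparse features -> latent update
        self.k = k                                     # features kept active per step

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Encode, then keep only the top-k activations (explicit sparsity constraint).
        acts = torch.relu(self.encoder(h))
        topk = torch.topk(acts, self.k, dim=-1)
        sparse = torch.zeros_like(acts).scatter_(-1, topk.indices, topk.values)
        # The residual skip decouples linear transport from the sparse semantic update.
        return self.skip(h) + self.decoder(sparse)


# Multi-step latent reasoning: iterate the transcoder on the hidden state.
ltt = LatentTransitionTranscoder(d_model=768, n_features=16384, k=32)
h = torch.randn(1, 768)
for _ in range(4):  # four hypothetical latent reasoning steps
    h = ltt(h)
```

Because every semantic update passes through a small set of dictionary features, each reasoning step can in principle be read off, and constrained, feature by feature.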

📝 Abstract
Latent reasoning compresses the chain-of-thought (CoT) into continuous hidden states, yet existing methods rely on dense latent transitions that remain difficult to interpret and control. Meanwhile, sparse representation models uncover human-interpretable semantic features but remain largely confined to post-hoc analysis. We reconcile this tension by proposing LSTR (Latent Sparse Transcoder Reasoning), a latent reasoning framework that elevates functional sparse transcoders into active reasoning operators to perform multi-step computation through sparse semantic transitions. At its core, LSTR employs a Latent Transition Transcoder (LTT) with a residual skip architecture that decouples linear manifold transport from sparse semantic updates, enabling controllable semantic resolution via explicit sparsity constraints. Extensive experiments show that LSTR preserves reasoning accuracy and compression efficiency while substantially improving interpretability over dense latent baselines. Causal interventions and trajectory analyses further demonstrate that these sparse features act as both interpretable and causally effective operators in the reasoning process.
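A causal intervention in this setting can be sketched as ablating one learned sparse feature during a transition and measuring how the resulting latent state shifts. This continues the hypothetical LatentTransitionTranscoder sketch above and illustrates the idea only; it is not the paper's actual intervention protocol.

```python
import torch


def ablate_feature(ltt, h: torch.Tensor, feature_idx: int) -> torch.Tensor:
    """Run one transcoder step with a single sparse feature forcibly zeroed."""
    acts = torch.relu(ltt.encoder(h))
    topk = torch.topk(acts, ltt.k, dim=-1)
    sparse = torch.zeros_like(acts).scatter_(-1, topk.indices, topk.values)
    sparse[..., feature_idx] = 0.0  # the intervention: knock out one feature
    return ltt.skip(h) + ltt.decoder(sparse)


h = torch.randn(1, 768)
h_clean = ltt(h)                       # unmodified transition
h_ablated = ablate_feature(ltt, h, 7)  # transition with feature 7 ablated
effect = torch.norm(h_clean - h_ablated)
print(f"ablating feature 7 shifted the latent state by {effect.item():.4f}")
```

If a feature is causally effective rather than merely descriptive, ablating it should change the downstream reasoning trajectory, which is what the paper's intervention and trajectory analyses are reported to test across multiple steps.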
Problem

Research questions and friction points this paper is trying to address.

latent reasoning · sparse representation · interpretability · controllability · semantic features
Innovation

Methods, ideas, or system contributions that make the work stand out.

sparse transcoders · latent reasoning · interpretable AI · semantic transitions · causal operators
Authors

Yadong Wang
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, China

Haodong Chen
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, China

Yu Tian
Institute for AI, Tsinghua University, Beijing, China

Chuanxing Geng
Nanjing University of Aeronautics and Astronautics (Machine Learning; Pattern Recognition)

Dong Liang
Professor, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (image reconstruction; compressed sensing; magnetic resonance imaging; Machine Learning)

Xiang Chen
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, China