Tiny Recursive Reasoning with Mamba-2 Attention Hybrid

📅 2026-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the bottleneck in abstract reasoning capability of recursive reasoning models under tight parameter budgets by replacing the Transformer module in TRM with a Mamba-2 hybrid operator. The resulting model integrates state-space mechanisms with attention inside an implicit recursive reasoning framework while keeping a comparable parameter count (6.8M). The study presents the first validation of Mamba-2's effectiveness in recursive reasoning, expanding the design space for recursive operators and improving candidate-solution coverage. On the ARC-AGI-1 dataset, the model preserves pass@1 performance while achieving a 2.0-percentage-point gain in pass@2 (45.88% vs. 43.88%) and a 4.75-point improvement in pass@100, demonstrating greater stability and diversity in generating correct solutions.

📝 Abstract
Recent work on recursive reasoning models like TRM demonstrates that tiny networks (7M parameters) can achieve strong performance on abstract reasoning tasks through latent recursion -- iterative refinement in hidden representation space without emitting intermediate tokens. This raises a natural question about operator choice: Mamba-2's state space recurrence is itself a form of iterative refinement, making it a natural candidate for recursive reasoning -- but does introducing Mamba-2 into the recursive scaffold preserve reasoning capability? We investigate this by replacing the Transformer blocks in TRM with Mamba-2 hybrid operators while maintaining parameter parity (6.83M vs 6.86M parameters). On ARC-AGI-1, we find that the hybrid improves pass@2 (the official metric) by +2.0% (45.88% vs 43.88%) and consistently outperforms at higher K values (+4.75% at pass@100), while maintaining pass@1 parity. This suggests improved candidate coverage -- the model generates correct solutions more reliably -- with similar top-1 selection. Our results validate that Mamba-2 hybrid operators preserve reasoning capability within the recursive scaffold, establishing SSM-based operators as viable candidates in the recursive operator design space and taking a first step towards understanding the best mixing strategies for recursive reasoning.
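The abstract's central idea -- a state-space recurrence used as the refinement operator inside a latent-recursion loop -- can be illustrated with a toy sketch. This is a hypothetical, heavily simplified illustration, not the paper's code: `ssm_step` is a scalar stand-in for a Mamba-2 hybrid block, and the function names and constants (`a`, `b`, `n_refine`) are illustrative assumptions.

```python
# Toy sketch of latent recursion with a state-space operator.
# Hypothetical names; stands in for TRM's operator, not an implementation of it.

def ssm_step(h, x, a=0.9, b=0.1):
    """One scalar state-space recurrence step: h' = a*h + b*x.
    A real Mamba-2 block uses input-dependent, structured matrices."""
    return a * h + b * x

def latent_recursion(x_seq, n_refine=3):
    """Refine a latent state z over the input sequence n_refine times,
    without emitting intermediate tokens (the 'latent recursion' idea)."""
    z = 0.0
    for _ in range(n_refine):      # outer recursive-refinement passes
        for x in x_seq:            # inner state-space scan over the input
            z = ssm_step(z, x)
    return z
```

The point of the sketch is structural: the same operator is applied repeatedly to a hidden state, so a recurrence that is itself iterative (an SSM) slots naturally into the recursive scaffold.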
Problem

Research questions and friction points this paper is trying to address.

recursive reasoning
Mamba-2
abstract reasoning
state space model
reasoning capability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mamba-2
recursive reasoning
state space model
latent recursion
hybrid architecture