Universal Reasoning Model

📅 2025-12-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
The performance gains of Universal Transformers (UTs) on complex reasoning tasks—such as ARC-AGI and Sudoku—remain poorly understood, hindering principled architectural design. Method: This work identifies the recurrent inductive bias and strong nonlinearity—not structural complexity—as the primary drivers of the UT's reasoning capability. To operationalize these insights, we propose the Universal Reasoning Model (URM): it incorporates lightweight short convolutions for enhanced local pattern modeling, truncated backpropagation through time (BPTT) for improved training stability and long-range reasoning, and position-aware attention with adaptive depth unfolding. Contribution/Results: URM achieves 53.8% and 16.0% pass@1 on ARC-AGI 1 and ARC-AGI 2, respectively—state-of-the-art results at the time of writing. Crucially, this is the first work to systematically isolate, analyze, and empirically validate the roles of key inductive biases in UTs. By decoupling architectural components and linking them to reasoning mechanisms, URM establishes a new paradigm for designing efficient, interpretable, and reasoning-capable neural architectures.
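The two architectural ingredients named above can be illustrated with a minimal, dependency-free sketch: a depthwise causal "short" convolution for local pattern mixing, and a single shared-weight step unfolded several times (the recurrent inductive bias of a UT). All function names, kernel sizes, and the toy gated update are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of a Universal-Transformer-style recurrent block with a
# short causal convolution. Scalar "tokens" stand in for hidden vectors;
# everything here is a toy illustration, not the URM architecture itself.

def short_conv(xs, kernel):
    """Causal convolution: each output mixes the current token with the
    previous len(kernel)-1 tokens (zero-padded on the left)."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + xs
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(xs))]

def ut_step(xs, kernel, w):
    """One shared-weight step: local mixing via the short conv, then a
    strong pointwise nonlinearity (here a ReLU) with a residual add."""
    mixed = short_conv(xs, kernel)
    return [x + w * max(0.0, m) for x, m in zip(xs, mixed)]

def unfold(xs, kernel, w, depth):
    """Apply the SAME step `depth` times -- the UT's recurrent bias:
    depth comes from weight sharing, not from stacking new layers."""
    for _ in range(depth):
        xs = ut_step(xs, kernel, w)
    return xs
```

The key design point this sketch captures is that `unfold` reuses one set of parameters at every depth, so extra reasoning steps cost compute but no extra parameters.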

📝 Abstract
Universal transformers (UTs) have been widely used for complex reasoning tasks such as ARC-AGI and Sudoku, yet the specific sources of their performance gains remain underexplored. In this work, we systematically analyze UT variants and show that improvements on ARC-AGI primarily arise from the recurrent inductive bias and strong nonlinear components of the Transformer, rather than from elaborate architectural designs. Motivated by this finding, we propose the Universal Reasoning Model (URM), which enhances the UT with short convolution and truncated backpropagation. Our approach substantially improves reasoning performance, achieving state-of-the-art 53.8% pass@1 on ARC-AGI 1 and 16.0% pass@1 on ARC-AGI 2. Our code is available at https://github.com/zitian-gao/URM.
Problem

Research questions and friction points this paper is trying to address.

Analyzes performance sources of universal transformers for reasoning tasks
Proposes enhanced model with short convolution and truncated backpropagation
Achieves state-of-the-art results on ARC-AGI benchmark datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enhances Universal Transformers with short convolution
Improves reasoning via truncated backpropagation technique
Achieves state-of-the-art performance on ARC-AGI tasks
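The truncated-backpropagation idea in the bullets above can be sketched on a scalar recurrence: run many recurrent steps forward, but only differentiate through the last `k` of them, treating earlier states as constants (the "detach" in an autograd framework). The recurrence `x <- w * x` and the hand-derived product-rule gradient are illustrative assumptions for exposition only.

```python
# Hedged sketch of truncated BPTT on the toy recurrence x <- w * x.
# Gradients are computed by hand (product rule) so no autograd library
# is needed; in practice one would detach intermediate states instead.

def truncated_bptt_grad(x0, w, depth, k):
    """Run `depth` steps of x <- w*x, backpropagating only through the
    last `k` steps. Returns (final_x, d final_x / d w)."""
    x = x0
    # Forward-only prefix: these states are treated as constants.
    for _ in range(depth - k):
        x = w * x
    # Last k steps: track dx/dw via the product rule d(w*x)/dw = x + w*dx/dw.
    grad = 0.0
    for _ in range(k):
        grad = x + w * grad
        x = w * x
    return x, grad
```

With `depth == k` this recovers the full-BPTT gradient; smaller `k` trades gradient fidelity for cheaper, more stable training, which is the motivation the summary attributes to URM.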