Parallelizable Neural Turing Machines

📅 2026-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the inefficiency of the original Neural Turing Machine (NTM), whose inherently sequential operations hinder parallelization and slow training. The authors propose the Parallelizable Neural Turing Machine (P-NTM), which restructures the interaction between the controller and external memory so that the core recurrence can be executed as a parallel scan, matching the parallel computation capabilities of modern deep learning frameworks; tasks are solved via autoregressive decoding. The P-NTM retains the NTM's strong generalization to unseen sequence lengths, reaching 100% accuracy across a range of algorithmic tasks, including state tracking, memory recall, and basic arithmetic, while training nearly an order of magnitude faster than the standard NTM. The authors present this as the first demonstration of efficient parallel training within the NTM architecture.

📝 Abstract
We introduce a parallelizable simplification of the Neural Turing Machine (NTM), referred to as P-NTM, which redesigns the core operations of the original architecture to enable efficient scan-based parallel execution. We evaluate the proposed architecture on a synthetic benchmark of algorithmic problems involving state tracking, memorization, and basic arithmetic, solved via autoregressive decoding. We compare it against a revisited stable implementation of the standard NTM, as well as conventional recurrent and attention-based architectures. Results show that, despite its simplifications, the proposed model attains length-generalization performance comparable to the original, learning to solve all problems, including unseen sequence lengths, with perfect accuracy. It also improves training efficiency, with parallel execution of P-NTM being up to an order of magnitude faster than the standard NTM. Ultimately, this work contributes toward the development of efficient neural architectures capable of expressing a broad class of algorithms.
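The abstract attributes the speedup to scan-based parallel execution. The paper's exact formulation is not shown here, but as a hedged illustration, the toy sketch below shows the standard idea behind scan-based parallelism: an affine recurrence h_t = a_t * h_{t-1} + b_t can be rewritten with an associative combine, so a parallel runtime (e.g. `jax.lax.associative_scan`) can evaluate it in O(log T) depth instead of O(T) sequential steps. All names below are hypothetical.

```python
# Illustrative sketch, not the paper's implementation: the combine composes
# two affine maps h -> a1*h + b1 and h -> a2*h + b2 into one affine map.
def combine(x, y):
    a1, b1 = x
    a2, b2 = y
    return (a2 * a1, a2 * b1 + b2)

def inclusive_scan(pairs):
    """Sequential reference for the inclusive scan. Because `combine` is
    associative, a parallel backend may evaluate the same scan as a
    balanced tree, which is what makes the recurrence parallelizable."""
    out = [pairs[0]]
    for p in pairs[1:]:
        out.append(combine(out[-1], p))
    return out

# Recurrence h_t = a_t * h_{t-1} + b_t with h_0 = 0 (toy values)
a = [0.5, 2.0, 1.0, 0.1]
b = [1.0, -1.0, 0.5, 3.0]
h = [bt for _, bt in inclusive_scan(list(zip(a, b)))]  # h_t given h_0 = 0

# Cross-check against the naive sequential loop
state, h_ref = 0.0, []
for at, bt in zip(a, b):
    state = at * state + bt
    h_ref.append(state)
assert all(abs(x - y) < 1e-12 for x, y in zip(h, h_ref))
```

The design point is that the sequential loop and the scan compute identical values; only the evaluation order differs, which is what allows the recurrence to be mapped onto parallel hardware.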
Problem

Research questions and friction points this paper is trying to address.

Neural Turing Machine
parallelization
length generalization
training efficiency
algorithmic reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallelizable Neural Turing Machine
scan-based parallelism
length generalization
algorithmic reasoning
efficient neural architecture