LLM-as-RNN: A Recurrent Language Model for Memory Updates and Sequence Prediction

📅 2026-01-19
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses a key limitation of large language models (LLMs): they lack updatable memory during inference, which prevents them from dynamically correcting their own errors. The authors propose the LLM-as-RNN framework, which, for the first time, employs a frozen LLM recursively by constructing a natural-language memory state through structured system prompts and iteratively rewriting that memory from feedback at each step, enabling online learning without any parameter updates. The approach improves both interpretability and performance, raising accuracy by 6.5% on average across three sequential prediction tasks in healthcare, meteorology, and finance. It significantly outperforms baselines including zero-shot prompting, full-history prompting, and MemPrompt, while producing human-readable learning trajectories that elucidate the model's reasoning process.

📝 Abstract
Large language models are strong sequence predictors, yet standard inference relies on immutable context histories. After making an error at generation step t, the model lacks an updatable memory mechanism that improves predictions for step t+1. We propose LLM-as-RNN, an inference-only framework that turns a frozen LLM into a recurrent predictor by representing its hidden state as natural-language memory. This state, implemented as a structured system-prompt summary, is updated at each timestep via feedback-driven text rewrites, enabling learning without parameter updates. Under a fixed token budget, LLM-as-RNN corrects errors and retains task-relevant patterns, effectively performing online learning through language. We evaluate the method on three sequential benchmarks in healthcare, meteorology, and finance across Llama, Gemma, and GPT model families. LLM-as-RNN significantly outperforms zero-shot, full-history, and MemPrompt baselines, improving predictive accuracy by 6.5% on average, while producing interpretable, human-readable learning traces absent in standard context accumulation.
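The recurrent loop the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: `call_llm` is a hypothetical stand-in for a frozen LLM API call (it is stubbed here so the control flow runs), and the prompt wording, function names, and the character-count proxy for the fixed token budget are all assumptions.

```python
# Sketch of the LLM-as-RNN inference loop: a frozen LLM acts as a
# recurrent predictor whose "hidden state" is a natural-language memory
# string, rewritten from feedback after each prediction step.

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Stub for a frozen LLM call (e.g., a Llama/Gemma/GPT chat API).

    A real implementation would send both prompts to the model; the stub
    just returns canned text so the loop below is runnable.
    """
    if user_prompt.startswith("Predict"):
        return "stub-prediction"
    return system_prompt + " | updated"

MAX_MEMORY_CHARS = 2000  # fixed budget, approximated here in characters

def predict(memory: str, observation: str) -> str:
    """Predict the next item, conditioned on the natural-language memory."""
    return call_llm(system_prompt=memory,
                    user_prompt=f"Predict the next value after: {observation}")

def update_memory(memory: str, observation: str,
                  prediction: str, truth: str) -> str:
    """Feedback-driven rewrite: the LLM rewrites its own memory state."""
    feedback = (f"Input: {observation}\n"
                f"Predicted: {prediction}\n"
                f"Actual: {truth}\n"
                "Rewrite the memory summary to correct errors and keep "
                "task-relevant patterns.")
    new_memory = call_llm(system_prompt=memory, user_prompt=feedback)
    return new_memory[:MAX_MEMORY_CHARS]  # enforce the fixed budget

def run_sequence(observations, truths,
                 initial_memory="You are a sequential predictor."):
    """Run the recurrent loop over a labeled sequence, returning
    all predictions and the final memory state."""
    memory, predictions = initial_memory, []
    for obs, truth in zip(observations, truths):
        pred = predict(memory, obs)                       # step t: predict
        memory = update_memory(memory, obs, pred, truth)  # update state for t+1
        predictions.append(pred)
    return predictions, memory
```

Note the design point the abstract emphasizes: no weights change anywhere in this loop. All "learning" lives in the rewritten memory string, which is why the trajectory of memory states is directly human-readable.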
Problem

Research questions and friction points this paper is trying to address.

language models
memory updates
sequence prediction
online learning
recurrent inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-as-RNN
recurrent inference
natural-language memory
online learning
feedback-driven rewriting