Resona: Improving Context Copying in Linear Recurrence Models with Retrieval

📅 2025-03-28

📈 Citations: 0

✨ Influential: 0

career value

180K/year

🤖 AI Summary

Linear recurrent models significantly underperform Transformers on long-context tasks such as in-context learning. To address this, we propose the first lightweight retrieval-augmented framework tailored for linear recurrent architectures. Our core innovation is the seamless integration of a dynamic context retrieval mechanism into the linear recurrence process, enabling task-adaptive information injection. The method comprises three key components: (i) key-value-based context-aware attention, (ii) an SSM-compatible state interface for recurrent state management, and (iii) a differentiable approximate nearest-neighbor retrieval module. The framework is plug-and-play and compatible with any linear recurrent backbone. Evaluated across multiple synthetic and real-world NLP benchmarks, it achieves an average accuracy improvement of 12.7% and reduces context copying error rate by 41%, substantially narrowing the performance gap with Transformer models.

Technology Category

Application Category

📝 Abstract

Recent shifts in the space of large language model (LLM) research have shown an increasing focus on novel architectures to compete with prototypical Transformer-based models that have long dominated this space. Linear recurrent models have proven to be a viable competitor due to their computational efficiency. However, such models still demonstrate a sizable gap compared to Transformers in terms of in-context learning among other tasks that require recalling information from a context. In this work, we introduce __Resona__, a simple and scalable framework for augmenting linear recurrent models with retrieval. __Resona__~augments models with the ability to integrate retrieved information from the provided input context, enabling tailored behavior to diverse task requirements. Experiments on a variety of linear recurrent models demonstrate that __Resona__-augmented models observe significant performance gains on a variety of synthetic as well as real-world natural language tasks, highlighting its ability to act as a general purpose method to improve the in-context learning and language modeling abilities of linear recurrent LLMs.

Problem

Research questions and friction points this paper is trying to address.

Enhancing context copying in linear recurrence models

Bridging performance gap with Transformer models

Improving in-context learning via retrieval augmentation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Augments linear recurrent models with retrieval

Integrates retrieved information from input context

Improves in-context learning and language modeling

🔎 Similar Papers

Efficient Length-Generalizable Attention via Causal Retrieval for Long-Context Language Modeling