🤖 AI Summary
Existing large language models (LLMs) are constrained by fixed context windows, hindering the generation of long, coherent, and semantically consistent outputs. Memory-augmentation approaches either lack generality or reduce memory updates to simple concatenation. To address this, we propose the first LLM-native stored-program computer (von Neumann architecture) tailored for long-output generation. Our framework features an evolvable dual-component memory comprising an instruction registry and a file store, a control unit enabling precise context management with read/write control, and multi-agent coordination for instruction-driven, programmatic execution. Crucially, it eliminates reliance on manual prompt engineering. Evaluated on system-level codebase generation, a demanding long-output task, our method achieves state-of-the-art performance. It also produces entire books and other ultra-long, semantically coherent texts, significantly outperforming prior approaches in both task completion rate and semantic consistency.
📝 Abstract
Transformer-based large language models (LLMs) are constrained by the fixed context window of the underlying architecture, hindering their ability to produce long and coherent outputs. Memory-augmented LLMs are a promising solution, but current approaches cannot handle long-output generation tasks since they (1) only focus on reading memory and reduce its evolution to the concatenation of new memories, or (2) use highly specialized memories that cannot adapt to other domains. This paper presents L2MAC, the first practical LLM-based general-purpose stored-program automatic computer (von Neumann architecture) framework, an LLM-based multi-agent system, for long and consistent output generation. Its memory has two components: the instruction registry, which is populated with a prompt program to solve the user-given task, and a file store, which contains the final and intermediate outputs. Each instruction is executed in turn by a separate LLM agent, whose context is managed by a control unit capable of precise memory reading and writing to ensure effective interaction with the file store. These components enable L2MAC to generate extensive outputs, bypassing the constraints of the finite context window while producing outputs that fulfill a complex user-specified task. We empirically demonstrate that L2MAC achieves state-of-the-art performance in generating large codebases for system design tasks, significantly outperforming other coding methods in implementing the detailed user-specified task; we show that L2MAC works for general-purpose extensive text-based tasks, such as writing an entire book; and we provide valuable insights into L2MAC's performance improvement over existing methods.
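To make the stored-program analogy concrete, here is a minimal sketch (not the paper's implementation) of the execution loop the abstract describes: an instruction registry holding the prompt program, a file store for intermediate and final outputs, and a control unit that runs each instruction with a fresh agent context that sees only the file store rather than the full history. The `stub_agent` stands in for a real LLM agent and is purely hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class StoredProgramSketch:
    """Toy model of L2MAC's dual memory: an instruction registry
    (the prompt program) plus a file store of outputs."""
    instruction_registry: list                       # one prompt instruction per step
    file_store: dict = field(default_factory=dict)   # final and intermediate outputs

    def run(self, agent):
        """Control unit: execute instructions sequentially. Each agent call
        receives only the current instruction and the file store, so the
        accumulated output never has to fit in one context window."""
        for instruction in self.instruction_registry:
            writes = agent(instruction, self.file_store)  # files to create/update
            self.file_store.update(writes)                # precise memory writing
        return self.file_store

# Hypothetical stand-in for an LLM agent: deterministic, for illustration only.
def stub_agent(instruction, files):
    return {f"out_{len(files)}.txt": f"result of: {instruction}"}

machine = StoredProgramSketch(["write module A", "write module B"])
result = machine.run(stub_agent)
```

The key design point this mirrors is that memory evolves through targeted reads and writes to the file store, not by concatenating an ever-growing transcript.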