🤖 AI Summary
This work addresses two problems: large language models are constrained by fixed context windows, and existing recursive language models (RLMs) rely on unstructured code generation that is difficult to verify or analyze. The authors propose λ-RLM, a framework that brings typed lambda calculus into recursive reasoning, decomposing problems into bounded leaf subproblems and invoking neural inference only within those subproblems. By executing a pre-verified combinator library, λ-RLM enforces structured control flow and provides formal guarantees (termination, closed-form bounds on computational cost, and controlled accuracy scaling with recursion depth), along with an optimal partitioning rule under a simple cost model. Evaluated across four long-context tasks and nine base models (36 model-task comparisons in total), λ-RLM outperforms the standard RLM in 29 cases, improving average accuracy by up to 21.9 points across model tiers and reducing latency by up to 4.1×.
📝 Abstract
LLMs are increasingly used as general-purpose reasoners, but long inputs remain bottlenecked by a fixed context window. Recursive Language Models (RLMs) address this by externalising the prompt and recursively solving subproblems. Yet existing RLMs depend on an open-ended read-eval-print loop (REPL) in which the model generates arbitrary control code, making execution difficult to verify, predict, and analyse.
We introduce $λ$-RLM, a framework for long-context reasoning that replaces free-form recursive code generation with a typed functional runtime grounded in $λ$-calculus. It executes a compact library of pre-verified combinators and uses neural inference only on bounded leaf subproblems, turning recursive reasoning into a structured functional program with explicit control flow. We show that $λ$-RLM admits formal guarantees absent from standard RLMs, including termination, closed-form cost bounds, controlled accuracy scaling with recursion depth, and an optimal partition rule under a simple cost model. Empirically, across four long-context reasoning tasks and nine base models, $λ$-RLM outperforms the standard RLM in 29 of 36 model-task comparisons, improves average accuracy by up to +21.9 points across model tiers, and reduces latency by up to 4.1×. These results show that typed symbolic control yields a more reliable and efficient foundation for long-context reasoning than open-ended recursive code generation. The complete implementation of $λ$-RLM is open-sourced for the community at: https://github.com/lambda-calculus-LLM/lambda-RLM.
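To make the idea of fixed combinators with bounded leaf calls concrete, here is a minimal Python sketch. It is an illustration only and not the authors' implementation or the API of the linked repository: the names `split`, `solve`, `leaf`, `combine`, `k`, and `max_leaf` are hypothetical, and the leaf function is a plain string operation standing in for a neural call. The control flow is ordinary deterministic code, and the stand-in model is invoked only on chunks no longer than `max_leaf`.

```python
from typing import Callable, List, TypeVar

R = TypeVar("R")  # result type produced by leaves and by the combiner


def split(text: str, k: int) -> List[str]:
    """Partition text into at most k contiguous, non-empty chunks."""
    size = -(-len(text) // k)  # ceiling division
    return [text[i:i + size] for i in range(0, len(text), size)]


def solve(
    text: str,
    leaf: Callable[[str], R],           # hypothetical stand-in for a bounded model call
    combine: Callable[[List[R]], R],    # deterministic reducer over child results
    k: int = 2,
    max_leaf: int = 8,
) -> R:
    """Recursively split the input until each piece fits in a leaf, then combine."""
    if k < 2 or max_leaf < 1:
        raise ValueError("need k >= 2 and max_leaf >= 1 so every split shrinks the input")
    if len(text) <= max_leaf:
        return leaf(text)  # the only place the stand-in model is called
    # Each chunk is strictly shorter than text, so the recursion terminates.
    return combine([solve(part, leaf, combine, k, max_leaf) for part in split(text, k)])


if __name__ == "__main__":
    # Toy task: count the letter "a" in a 60-character input.
    total = solve("banana" * 10, lambda s: s.count("a"), sum)
    print(total)  # 30
```

Because every chunk is strictly shorter than its parent whenever `k >= 2`, the recursion in this sketch always terminates, and the number of leaf calls is determined by the input length, `k`, and `max_leaf` alone. That is the flavor of guarantee the abstract refers to, though the paper's actual combinators, type system, and cost model may differ.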