Test-Time Learning with an Evolving Library

📅 2026-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

How can large language models continuously accumulate and evolve knowledge during inference without updating model parameters or relying on external supervision? This work proposes EvoLib, a framework that builds a dynamically evolving shared knowledge library by automatically extracting modular skills and reflective insights from the model's own reasoning trajectories. Through weighting and consolidation mechanisms, EvoLib generalizes instance-level knowledge into reusable, transferable abstractions. It is presented as the first approach to achieve test-time knowledge evolution without any parameter updates. Reported results show that EvoLib improves substantially over leading test-time scaling and learning methods on mathematical reasoning, code generation, and multi-turn agent tasks, all without requiring ground-truth feedback.

📝 Abstract

We introduce EvoLib, a test-time learning framework that enables large language models to accumulate, reuse, and evolve knowledge across problem instances without parameter updates or external supervision. Instead of adapting model parameters, our approach maintains a shared library of knowledge abstractions, including modular skills and reflective insights, automatically extracted from the model's own inference trajectories. To support continual improvement, we introduce a principled weighting and consolidation mechanism that jointly optimizes for immediate utility and long-term value. This allows simple, instance-specific abstractions to evolve into more general and reusable ones over time. Across challenging benchmarks in mathematical reasoning, code generation, and multi-turn agentic environments, EvoLib improves substantially over the top test-time scaling and learning methods without ground-truth feedback.
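The abstract does not spell out the weighting rule, so the following is only a minimal sketch of how a shared library might rank and prune entries. Everything in it is an assumption for illustration: the `Entry` and `Library` names, the `alpha` trade-off, the saturating reuse bonus used as a stand-in for long-term value, and the self-assessed score used in place of ground-truth feedback. It is not EvoLib's actual mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One library item (hypothetical schema): a modular skill or a reflective insight."""
    kind: str                 # "skill" or "insight"
    text: str
    immediate: float = 0.0    # running mean of self-assessed utility
    uses: int = 0             # number of problem instances that reused it


@dataclass
class Library:
    """Shared store of abstractions, ranked by an assumed blended weight."""
    alpha: float = 0.5        # assumed trade-off: immediate utility vs. reuse
    entries: list = field(default_factory=list)

    def add(self, kind: str, text: str) -> Entry:
        entry = Entry(kind, text)
        self.entries.append(entry)
        return entry

    def record_use(self, entry: Entry, self_score: float) -> None:
        # self_score in [0, 1] is the model's own assessment, not ground truth.
        entry.uses += 1
        entry.immediate += (self_score - entry.immediate) / entry.uses

    def weight(self, entry: Entry) -> float:
        # Assumed rule: blend immediate utility with a saturating reuse bonus
        # that serves as a crude proxy for long-term value.
        long_term = entry.uses / (1.0 + entry.uses)
        return self.alpha * entry.immediate + (1.0 - self.alpha) * long_term

    def retrieve(self, k: int) -> list:
        # Return the k highest-weighted entries to condition the next instance.
        return sorted(self.entries, key=self.weight, reverse=True)[:k]

    def consolidate(self, max_size: int) -> None:
        # Simplified consolidation: keep only the top-weighted entries.
        # Merging entries into more general abstractions would need a model
        # call and is deliberately left out of this sketch.
        self.entries = self.retrieve(max_size)
```

In this sketch an entry that is never reused keeps a weight of zero and is the first to be dropped by `consolidate`, while entries that are reused often and score well rise to the top of `retrieve`.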

Problem

Research questions and friction points this paper is trying to address.

test-time learning

knowledge evolution

large language models

parameter-free adaptation

knowledge reuse

Innovation

Methods, ideas, or system contributions that make the work stand out.

test-time learning

knowledge evolution

modular skills