LatentEvolve: Self-Evolving Test-Time Scaling in Latent Space

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of dynamically scaling computational resources for large language models (LLMs) during inference. The authors propose LatentEvolve, a self-evolving latent test-time scaling (TTS) framework inspired by the complementary learning systems hypothesis. It emulates hippocampal–neocortical interaction via a two-phase mechanism: rapid "daytime" retrieval of historical latent representations and slow "nighttime" integration and optimization, allowing inference capability to evolve continually without parameter updates. Key technical contributions include latent-space experience replay, unsupervised fast–slow knowledge consolidation, and continuous prompt adaptation. Evaluated across eight benchmarks and five mainstream LLMs, LatentEvolve significantly outperforms LatentSeek and TTRL, achieving up to a 13.33% improvement, and demonstrates strong generalization across diverse tasks and model architectures.
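The two-phase mechanism described above can be illustrated with a minimal sketch. Everything here is an assumption for demonstration (the class names, the cosine-similarity retrieval, and the exponential-moving-average consolidation rule are hypothetical), not the paper's implementation:

```python
# Illustrative sketch (not the paper's implementation) of a two-phase
# latent test-time scaling loop: fast "daytime" recall of past latents
# and slow "nighttime" consolidation into a prototype.
import numpy as np

class LatentMemory:
    """Episodic store of latent vectors from past inferences ("hippocampus")."""
    def __init__(self):
        self.latents = []

    def add(self, z):
        self.latents.append(np.asarray(z, dtype=float))

    def retrieve(self, query, k=3):
        """Daytime scaling: fast recall of the k most similar past latents."""
        if not self.latents:
            return []
        q = np.asarray(query, dtype=float)
        sims = [float(q @ z / (np.linalg.norm(q) * np.linalg.norm(z) + 1e-8))
                for z in self.latents]
        order = np.argsort(sims)[::-1][:k]       # highest similarity first
        return [self.latents[i] for i in order]

def nighttime_consolidate(memory, prototype, lr=0.1):
    """Nighttime scaling: slowly fold episodic latents into a consolidated
    prototype ("neocortex"), then clear the episodic buffer, like sleep."""
    for z in memory.latents:
        prototype = (1 - lr) * prototype + lr * z
    memory.latents.clear()
    return prototype

# Usage: alternate fast recall during the "day" with slow consolidation at "night".
mem = LatentMemory()
proto = np.zeros(4)
for z in ([1, 0, 0, 0], [0.9, 0.1, 0, 0], [0, 0, 1, 0]):
    mem.add(z)
guides = mem.retrieve([1, 0, 0, 0], k=2)  # nearest past latents guide current reasoning
proto = nighttime_consolidate(mem, proto)
```

The point of the sketch is the division of labor: retrieval is cheap and happens per query, while consolidation is a slower batch process that compresses the episodic store, mirroring the fast-recall/slow-consolidation split the summary attributes to CLS theory.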

📝 Abstract
Test-time Scaling (TTS) has been demonstrated to significantly enhance the reasoning capabilities of Large Language Models (LLMs) during the inference phase without altering model parameters. However, existing TTS methods are largely independent, implying that LLMs have not yet evolved to progressively learn how to scale more effectively. With the objective of evolving LLMs to learn "how to scale test-time computation," we propose LatentEvolve, a self-evolving latent TTS framework inspired by the complementary learning system (CLS) theory. Analogous to the human brain's dual system of a fast-recall hippocampus and a slow-consolidating neocortex, LatentEvolve comprises two evolutionary components: *daytime scaling*, which rapidly retrieves historical latent representations to better guide current LLM reasoning; and *nighttime scaling*, which integrates past latent optimizations in a manner akin to the human brain's consolidation of experiences during sleep. The alternation of daytime and nighttime processes facilitates a fast and slow evolution of LLM TTS, mirroring human cognitive dynamics in a fully unsupervised manner. Extensive experiments across eight benchmarks and five model backbones demonstrate that our LatentEvolve surpasses state-of-the-art TTS methods such as LatentSeek and TTRL by up to 13.33% and exhibits exceptional cross-domain and cross-backbone generalization.
Problem

Research questions and friction points this paper is trying to address.

Enabling LLMs to progressively learn how to scale test-time computation
Developing a self-evolving latent framework inspired by complementary brain systems
Improving reasoning through latent optimization without altering model parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-evolving latent TTS framework inspired by CLS theory
Daytime scaling retrieves historical latent representations
Nighttime scaling integrates past latent optimizations like sleep consolidation