Just-in-time and distributed task representations in language models

📅 2025-08-28
🤖 AI Summary
This work investigates when and how “transferable task representations”—vectorial encodings of task context that can be recovered in another model instance without full prompting—emerge and evolve within language models. Methodologically, we track the contextual recovery capability and task decodability of task-relevant vectors in hidden states across layers and positions. We find these representations exhibit strong spatiotemporal locality: they activate only at specific token positions and encode semantically fine-grained, minimal task units. Their evolution is non-monotonic and sporadic, relying on local, immediate computation rather than global accumulation. Key contributions include revealing the *instantaneous* mechanism underlying online task adaptation in LMs; empirically, the emergence of transferable task representations tightly correlates with performance gains, and despite strong sequential locality, they support distributed processing of complex, long-horizon tasks.
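The "task decodability" tracked in the summary above can be illustrated with a minimal linear-probe sketch: fit a classifier that predicts task identity from hidden-state vectors. Everything here is a toy stand-in, assuming synthetic clustered activations rather than the paper's actual models, layers, or data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "hidden states": two tasks, each a cluster around a
# task-specific mean direction in a 16-d activation space.
# The dimensions and noise scale are illustrative assumptions.
n_per_task, dim = 100, 16
means = rng.standard_normal((2, dim))
X = np.vstack([means[t] + 0.5 * rng.standard_normal((n_per_task, dim))
               for t in (0, 1)])
y = np.repeat([0, 1], n_per_task)

# A linear probe: if task identity is linearly decodable from these
# vectors, the probe separates the two tasks well above chance.
probe = LogisticRegression().fit(X, y)
acc = probe.score(X, y)
print(f"probe accuracy: {acc:.2f}")
```

In the paper's setting, such probes are applied across layers and token positions; the contrast is that task *identity* stays decodable throughout the context even where the transferable representation has not yet come online.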

📝 Abstract
Many of language models' impressive capabilities originate from their in-context learning: based on instructions or examples, they can infer and perform new tasks without weight updates. In this work, we investigate *when* representations for new tasks are formed in language models, and *how* these representations change over the course of context. We focus on "transferable" task representations -- vector representations that can restore task context in another instance of the model, even without the full prompt. We show that these representations evolve in non-monotonic and sporadic ways, and are distinct from a more inert representation of high-level task categories that persists throughout the context. Specifically, models often condense multiple pieces of evidence into these transferable task representations, which aligns well with the performance improvement gained from more examples in the context. However, this accrual process exhibits strong locality along the sequence dimension, coming online only at certain tokens -- despite task identity being reliably decodable throughout the context. Moreover, these local but transferable task representations tend to capture minimal "task scopes", such as a semantically independent subtask, and models rely on more temporally distributed representations to support longer and composite tasks. This two-fold locality (temporal and semantic) underscores a kind of just-in-time computational process underlying language models' ability to adapt to new evidence and learn new tasks on the fly.
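The core operation behind a "transferable" representation -- capturing a vector from one run and using it to restore task context in another -- can be sketched as a toy activation-patching experiment. The two-layer stand-in model, shapes, and inputs below are illustrative assumptions, not the paper's actual architecture or procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a transformer: two blocks; the candidate "task
# vector" is the hidden state between them.
W1 = rng.standard_normal((8, 8))
W2 = rng.standard_normal((4, 8))

def run(x, patch=None):
    h = np.tanh(W1 @ x)
    if patch is not None:
        h = patch          # overwrite the hidden state (activation patching)
    return W2 @ h, h

prompted = rng.standard_normal(8)  # pretend: input carrying in-context examples
bare = rng.standard_normal(8)      # pretend: query without the task context

# 1) Capture the candidate task vector from the prompted run.
prompted_out, task_vec = run(prompted)

# 2) Patch it into a fresh run on the bare input.
patched_out, _ = run(bare, patch=task_vec)

# In this toy, the patched run reproduces the prompted run's downstream
# computation: the vector alone carries the relevant context.
print(np.allclose(patched_out, prompted_out))  # → True
```

In the actual experiments, the analogous test is whether a hidden-state vector extracted at a given layer and token position restores task behavior in another model instance that never saw the full prompt.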
Problem

Research questions and friction points this paper is trying to address.

Investigating when task representations form in language models during context processing
Analyzing how transferable task representations evolve non-monotonically across sequences
Examining temporal and semantic locality in models' just-in-time computational processes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transferable task representations that can restore task context without the full prompt
Condensation of evidence from multiple in-context examples into these representations
Local but temporally distributed representations that capture minimal task scopes, such as independent subtasks