🤖 AI Summary
This work investigates how large language models (LLMs) represent and process dynamic temporal facts—i.e., knowledge that evolves over time.
Method: Leveraging attention-head-level circuit analysis, modular intervention, cross-model comparison, and time-conditioned triggering experiments, we systematically probe temporal reasoning mechanisms in LLMs.
Contribution/Results: We identify and empirically validate a class of generalizable “temporal attention heads” whose activations are selectively modulated by abstract temporal semantics—not merely numeric tokens—and exhibit robust cross-model consistency. Ablating these heads significantly degrades temporal question-answering accuracy while preserving general capabilities and non-temporal task performance. Moreover, fine-tuning head values enables precise, targeted editing of temporal knowledge. Our findings uncover a structured, circuit-level representation of temporal facts in LLMs, establishing a foundation for controllable temporal reasoning and efficient, localized knowledge updating.
📝 Abstract
While the ability of language models to elicit facts has been widely investigated, how they handle temporally changing facts remains underexplored. We discover Temporal Heads, specific attention heads primarily responsible for processing temporal knowledge, through circuit analysis. We confirm that these heads are present across multiple models, though their specific locations may vary, and that their responses differ depending on the type of knowledge and its corresponding years. Disabling these heads degrades the model's ability to recall time-specific knowledge while maintaining its general capabilities, without compromising time-invariant knowledge or question-answering performance. Moreover, the heads are activated not only by numeric conditions ("In 2004") but also by textual aliases ("In the year ..."), indicating that they encode a temporal dimension beyond simple numerical representation. Furthermore, we expand the potential of our findings by demonstrating how temporal knowledge can be edited by adjusting the values of these heads.
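The head-ablation idea can be illustrated with a toy sketch. This is not the paper's actual experimental code (which operates on real LLM attention heads); it is a minimal NumPy mock-up, with all names hypothetical, showing what "disabling a head" means mechanically: zeroing one head's contribution before the output projection.

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads, ablate_head=None):
    """Toy multi-head self-attention. If ablate_head is set, that head's
    output is zeroed -- a simple form of head ablation."""
    seq, d = x.shape
    d_head = d // n_heads
    q, k, v = x @ Wq, x @ Wk, x @ Wv

    # Split projections into heads: (n_heads, seq, d_head)
    def split(t):
        return t.reshape(seq, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(q), split(k), split(v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    heads = weights @ v                      # (n_heads, seq, d_head)
    if ablate_head is not None:
        heads[ablate_head] = 0.0             # disable this head's contribution
    out = heads.transpose(1, 0, 2).reshape(seq, d)
    return out @ Wo

# Hypothetical toy dimensions and random weights, for illustration only.
d, n_heads, seq = 8, 2, 4
W = [rng.standard_normal((d, d)) for _ in range(4)]
x = rng.standard_normal((seq, d))

full = multi_head_attention(x, *W, n_heads=n_heads)
ablated = multi_head_attention(x, *W, n_heads=n_heads, ablate_head=0)
print(np.allclose(full, ablated))  # False: removing head 0 changes the output
```

In the paper's setting, the analogous intervention on the identified Temporal Heads selectively harms time-specific recall while leaving other outputs largely intact, which is what localizes temporal knowledge to those heads.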