On the Robustness of Temporal Factual Knowledge in Language Models

📅 2025-02-03
📈 Citations: 0
✹ Influential: 0
đŸ€– AI Summary
This paper identifies a systematic deficiency in large language models (LLMs) in understanding and generalizing time-sensitive facts: statements valid only on specific days, months, or years. Using a Wikidata-derived benchmark of temporal facts spanning day, month, and year granularities, the authors run a controlled, prompt-based empirical evaluation showing that state-of-the-art models, including Llama-3.1-70B, exhibit significant shortcomings in both temporal fact accuracy and cross-granularity generalization. The core contributions are: (1) the first empirical demonstration that LLMs lack the ability to generalize knowledge across time granularities, exposing a fundamental limitation in their use as dynamic knowledge bases; and (2) a fine-grained temporal robustness evaluation framework that enables comparable assessment of both pre-trained and instruction-tuned models. Results indicate that current LLMs fall short of the precision required for high-fidelity temporal knowledge services.

📝 Abstract
This paper explores the temporal robustness of language models (LMs) in handling factual knowledge. While LMs can often complete simple factual statements, their ability to manage temporal facts (those valid only within specific timeframes) remains uncertain. We design a controlled experiment to test the robustness of temporal factual knowledge inside LMs, which we use to evaluate several pretrained and instruction-tuned models using prompts on popular Wikidata facts, assessing their performance across different temporal granularities (Day, Month, and Year). Our findings indicate that even very large state-of-the-art models, such as Llama-3.1-70B, vastly lack robust knowledge of temporal facts. In addition, they are incapable of generalizing their knowledge from one granularity to another. These results highlight the inherent limitations of using LMs as temporal knowledge bases. The source code and data to reproduce our experiments will be released.
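The evaluation described in the abstract can be pictured as asking the same temporal question at several granularities and checking the answer against a fact's validity interval. The sketch below is a hypothetical illustration of that setup: the fact, the prompt templates, and all names are assumptions for clarity, not the paper's actual benchmark or templates.

```python
from datetime import date

# Illustrative Wikidata-style temporal fact (assumed example, not from the paper):
# the statement is valid only between its start and end dates.
FACT = {
    "subject": "Barack Obama",
    "relation": "position held",
    "object": "President of the United States",
    "start": date(2009, 1, 20),
    "end": date(2017, 1, 20),
}

# Hypothetical prompt templates, one per temporal granularity.
TEMPLATES = {
    "day": "On {d:%d %B %Y}, who held the position of {obj}?",
    "month": "In {d:%B %Y}, who held the position of {obj}?",
    "year": "In {d:%Y}, who held the position of {obj}?",
}

def make_prompts(fact, query_date):
    """Render the same temporal question at day/month/year granularity."""
    return {
        gran: tpl.format(d=query_date, obj=fact["object"])
        for gran, tpl in TEMPLATES.items()
    }

def is_valid(fact, query_date):
    """Gold label: the fact holds only inside its validity interval."""
    return fact["start"] <= query_date <= fact["end"]

prompts = make_prompts(FACT, date(2012, 6, 15))
```

A robust model would answer all three prompts consistently whenever `is_valid` is true for the queried date; the paper reports that models fail to transfer knowledge between such granularities.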
Problem

Research questions and friction points this paper is trying to address.

Language Models
Temporal Facts
Time-specific Information
Innovation

Methods, ideas, or system contributions that make the work stand out.

Time-aware Language Models
Temporal Knowledge Representation
Granularity of Time Understanding
Hichem Ammar Khodja
Orange - Lannion, France; Aix Marseille Université, CNRS, LIS, UMR 7020 - Marseille, France
Frédéric Béchet
Aix Marseille Université, CNRS, LIS, UMR 7020 - Marseille, France; International Laboratory on Learning Systems (ILLS - IRL2020 CNRS)
Quentin Brabant
Orange - Lannion, France
Alexis Nasr
Aix Marseille Université, CNRS, LIS, UMR 7020 - Marseille, France
Gwénolé Lecorvé
Orange
Natural Language Processing, Language Modeling, Question Answering