🤖 AI Summary
Current AI systems lack temporal awareness, which limits their ability to judge whether a fact is still valid. To address this, we propose Chronocept, the first benchmark that models the temporal validity of a fact as a continuous probability distribution. The knowledge lifecycle (emergence, peak, and decay) is represented as a skew-normal distribution fitted along semantically decomposed temporal axes. Our contributions are: (1) a continuous-distribution formulation of temporal validity; (2) an interpretable parametric regression framework; and (3) a two-tier human-annotated benchmark of atomic facts and multi-sentence passages, with inter-annotator agreement of 84% and 89%, respectively. Experiments show that baselines predicting the location, scale, and skewness parameters outperform discrete classification baselines. The approach supports time-sensitive applications, including knowledge grounding, fact-checking, retrieval-augmented generation (RAG), and proactive agents, by enabling fine-grained temporal reasoning over factual knowledge.
📝 Abstract
Human cognition is deeply intertwined with a sense of time, known as Chronoception. This sense allows us to judge how long facts remain valid and when knowledge becomes outdated. Despite progress in vision, language, and motor control, AI still struggles to reason about temporal validity. We introduce Chronocept, the first benchmark to model temporal validity as a continuous probability distribution over time. Using skew-normal curves fitted along semantically decomposed temporal axes, Chronocept captures nuanced patterns of emergence, decay, and peak relevance. It includes two datasets: Benchmark I (atomic facts) and Benchmark II (multi-sentence passages). Annotations show strong inter-annotator agreement (84% and 89%, respectively). Our baselines predict the curve parameters (location, scale, and skewness), enabling interpretable, generalizable learning and outperforming classification-based approaches. Chronocept fills a foundational gap in AI's temporal reasoning, supporting applications in knowledge grounding, fact-checking, retrieval-augmented generation (RAG), and proactive agents. Code and data are publicly available.
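To make the three curve parameters concrete, the sketch below evaluates a skew-normal density from location, scale, and skewness using only the Python standard library. It is an illustration of the general skew-normal form, not Chronocept's released code: the function names, the time units, and the parameter values are hypothetical and chosen only for demonstration.

```python
import math


def skew_normal_pdf(t, location, scale, skewness):
    """Skew-normal density at time t: (2 / scale) * phi(z) * Phi(skewness * z)."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    z = (t - location) / scale
    # Standard normal density evaluated at z.
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Standard normal CDF evaluated at skewness * z.
    big_phi = 0.5 * (1.0 + math.erf(skewness * z / math.sqrt(2.0)))
    return (2.0 / scale) * phi * big_phi


def validity_curve(times, location, scale, skewness):
    """Evaluate the density on a grid of time points."""
    return [skew_normal_pdf(t, location, scale, skewness) for t in times]


# Hypothetical parameters on an arbitrary time axis (assumed units).
times = [i * 0.1 for i in range(101)]  # 0.0, 0.1, ..., 10.0
curve = validity_curve(times, location=3.0, scale=1.5, skewness=4.0)

# Grid point where the curve peaks, i.e. where validity is highest.
peak_time = times[max(range(len(curve)), key=curve.__getitem__)]
print(f"peak validity near t = {peak_time:.1f}")
```

With positive skewness the curve rises quickly after the location and decays slowly, so the peak falls to the right of the location; with skewness set to zero the curve reduces to an ordinary normal density.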