🤖 AI Summary
This study addresses temporal inconsistency in semantic retrieval from Classical Chinese chronicles, where time expressions are typically implicit and non-Gregorian, so a retrieved passage can be topically relevant yet belong to the wrong date. The work introduces the first month-granularity time-keyed retrieval task tailored to the *Spring and Autumn Annals* (*Chunqiu*), along with a new benchmark dataset, ChunQiuTR, which includes temporally proximate distractors to test temporal fidelity. The authors propose a calendar-aware dual-encoder, CTD (Calendrical Temporal Dual-encoder), that combines Fourier-encoded absolute calendrical context with relative temporal offset biasing. Experiments show that the approach consistently outperforms strong semantic dual-encoder baselines under time-keyed evaluation, supporting temporal consistency at retrieval time as a prerequisite for faithful retrieval-augmented generation over historical texts.
📝 Abstract
Retrieval shapes how language models access and ground knowledge in retrieval-augmented generation (RAG). In historical research, the target is often not an arbitrary relevant passage, but the exact record for a specific regnal month, where temporal consistency matters as much as topical relevance. This is especially challenging for Classical Chinese annals, where time is expressed through terse, implicit, non-Gregorian reign phrases that must be interpreted from surrounding context, so semantically plausible evidence can still be temporally invalid. We introduce **ChunQiuTR**, a time-keyed retrieval benchmark built from the *Spring and Autumn Annals* and its exegetical tradition. ChunQiuTR organizes records by month-level reign keys and includes chrono-near confounders that mirror realistic retrieval failures. We further propose **CTD** (Calendrical Temporal Dual-encoder), a time-aware dual-encoder that combines Fourier-based absolute calendrical context with relative offset biasing. Experiments show consistent gains over strong semantic dual-encoder baselines under time-keyed evaluation, supporting retrieval-time temporal consistency as a key prerequisite for faithful downstream historical RAG. Our code and datasets are available at https://github.com/xbdxwyh/ChunQiuTR.
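
To make the idea of combining absolute calendrical context with relative offset biasing concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how CTD computes or learns these terms. The sketch assumes that each record and query carries a hypothetical integer month index (months counted from an arbitrary start), that the Fourier periods `(12, 60, 240)` and the weights `alpha` and `beta` are hand-picked, and that the terms are simply added to a dot-product similarity.

```python
import math


def fourier_features(month_index, periods=(12, 60, 240)):
    """Sin/cos features of an absolute month index.

    The periods are illustrative assumptions, not values from the paper.
    """
    feats = []
    for period in periods:
        angle = 2.0 * math.pi * month_index / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def time_aware_score(query_vec, query_month, record_vec, record_month,
                     alpha=0.5, beta=0.1):
    """Semantic similarity plus two hypothetical temporal terms.

    alpha weights the calendrical agreement, beta penalizes month offset;
    both are assumed constants chosen for illustration.
    """
    semantic = dot(query_vec, record_vec)
    # For each period the sin/cos pair contributes cos(2*pi*delta/period),
    # which is largest when the two month indices coincide.
    calendrical = dot(fourier_features(query_month),
                      fourier_features(record_month))
    # Relative offset bias: a linear penalty on the month distance.
    offset_bias = -beta * abs(query_month - record_month)
    return semantic + alpha * calendrical + offset_bias


# Toy data: the query targets month 100. The confounder sits one month
# later and is semantically closer to the query than the true record.
query_vec, query_month = [1.0, 0.0], 100
true_record = ([0.8, 0.6], 100)
near_confounder = ([0.9, 0.436], 101)

semantic_true = dot(query_vec, true_record[0])
semantic_confounder = dot(query_vec, near_confounder[0])
score_true = time_aware_score(query_vec, query_month, *true_record)
score_confounder = time_aware_score(query_vec, query_month, *near_confounder)
```

In this toy setup the purely semantic score prefers the chrono-near confounder, while the time-aware score prefers the record from the queried month. That is the kind of failure a time-keyed evaluation is meant to expose.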