AI Summary
Traditional web search relies on static time-window filtering, which often fails to align user intent with the semantic freshness of content, frequently returning results that are temporally recent yet semantically outdated. This work proposes the first query-aware dynamic expiration prediction framework tailored for industrial-scale search systems, formulating timeliness modeling as a large language model (LLM)-driven validity reasoning task. By extracting fine-grained temporal context from documents, the framework infers query-specific "validity boundaries" to determine when information becomes obsolete due to semantic shifts. The approach integrates retrieval-augmented generation (RAG), query-aware reasoning, and hallucination suppression mechanisms. Deployed in Baidu Search, it demonstrates significant improvements in result freshness and user experience, as validated by both offline evaluations and online A/B tests.
Abstract
In commercial web search, aligning content freshness with user intent remains challenging because the lifespan of information varies widely. Traditional industrial approaches rely on static time-window filtering, resulting in "one-size-fits-all" rankings where content may be chronologically recent but semantically expired. To address this limitation, we present a novel Large Language Model (LLM)-based Query-Aware Dynamic Content Expiration Prediction Framework deployed in Baidu Search, reformulating timeliness as a dynamic validity inference task. Our framework extracts fine-grained temporal contexts from documents and leverages LLMs to deduce a query-specific "validity horizon": a semantic boundary that defines when information becomes obsolete given the user's intent. Integrated with robust hallucination mitigation strategies to ensure reliability, our approach has been evaluated through offline experiments and online A/B tests on live production traffic. Results demonstrate significant improvements in search freshness and user experience metrics, validating the effectiveness of LLM-driven reasoning for solving semantic expiration at industrial scale.
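To make the contrast between static time-window filtering and a query-specific validity horizon concrete, the following is a minimal illustrative sketch, not the deployed system. The function names, the 30-day window, the example queries, and the hard-coded horizon dates are all assumptions introduced here; in the framework described above the horizon would be inferred by an LLM from the document's temporal context rather than looked up in a table.

```python
from datetime import date, timedelta
from typing import Optional


def fresh_by_static_window(published: date, today: date, window_days: int = 30) -> bool:
    """Baseline: content counts as fresh if published within a fixed window."""
    return (today - published) <= timedelta(days=window_days)


def fresh_by_validity_horizon(horizon: Optional[date], today: date) -> bool:
    """Query-aware check: content is fresh until its inferred horizon.

    A horizon of None means the content does not expire for that query.
    """
    return horizon is None or today <= horizon


# Hypothetical document: a sale announcement published on June 1
# for a promotion that ran through June 7.
published = date(2024, 6, 1)
today = date(2024, 6, 10)

# Hypothetical per-query horizons for the SAME document. These stand in
# for LLM output and are hard-coded purely for illustration.
horizons = {
    "current store discounts": date(2024, 6, 7),  # expired once the sale ended
    "history of store sales": None,               # still valid as a record
}

static_result = fresh_by_static_window(published, today)
query_results = {
    query: fresh_by_validity_horizon(horizon, today)
    for query, horizon in horizons.items()
}

print("static window:", static_result)
for query, is_fresh in query_results.items():
    print(f"{query!r}: {is_fresh}")
```

In this toy setup the static window treats the nine-day-old page as fresh for every query, while the horizon check marks it expired for the discount-seeking query and still valid for the historical one, which is the "chronologically recent but semantically expired" case the abstract describes.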