Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization

📅 2025-01-01
🤖 AI Summary
Open-domain timeline summarization (TLS) requires automatically constructing coherent, causally grounded timelines from massive, heterogeneous news streams, a task made difficult by information overload. Method: We propose CHRONOS, the first framework leveraging causal-reasoning-driven iterative self-questioning to guide LLMs in dynamically retrieving reports, modeling event causality, and generating timeline graphs. It integrates iterative retrieval-augmented generation (RAG), dynamic query generation, and structured event graph construction. Contributions/Results: (1) We introduce Open-TLS, the first open-domain TLS benchmark, built from timelines written by professional journalists; (2) CHRONOS significantly outperforms existing baselines on open-domain TLS and matches state-of-the-art performance in closed-domain settings; (3) Extensive ablation and case studies empirically validate the effectiveness and robustness of the self-questioning paradigm for real-world news temporal understanding.

📝 Abstract
In the fast-changing realm of information, the capacity to construct coherent timelines from extensive event-related content has become increasingly significant and challenging. The complexity arises in aggregating related documents to build a meaningful event graph around a central topic. This paper proposes CHRONOS - Causal Headline Retrieval for Open-domain News Timeline SummarizatiOn via Iterative Self-Questioning, which offers a fresh perspective on the integration of Large Language Models (LLMs) to tackle the task of Timeline Summarization (TLS). By iteratively reflecting on how events are linked and posing new questions regarding a specific news topic to gather information online or from an offline knowledge base, LLMs produce and refresh chronological summaries based on documents retrieved in each round. Furthermore, we curate Open-TLS, a novel dataset of timelines on recent news topics authored by professional journalists to evaluate open-domain TLS where information overload makes it impossible to find comprehensive relevant documents from the web. Our experiments indicate that CHRONOS is not only adept at open-domain timeline summarization, but it also rivals the performance of existing state-of-the-art systems designed for closed-domain applications, where a related news corpus is provided for summarization.
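The iterative loop described in the abstract (pose questions, retrieve documents, refresh the timeline, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: the LLM self-questioning step and the search engine are replaced here by simple rule-based stand-ins, and every name (`Doc`, `CORPUS`, `ask_questions`, `retrieve`, `build_timeline`) as well as the toy news data is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    date: str  # ISO date, so string order equals chronological order
    text: str


# Toy offline knowledge base (hypothetical data, for illustration only).
CORPUS = [
    Doc("2024-01-05", "Company X announces merger talks with Company Y"),
    Doc("2024-02-10", "Regulators open review of the merger between X and Y"),
    Doc("2024-03-15", "Review concludes and regulators approve the deal"),
    Doc("2024-03-20", "Unrelated storm closes airports"),
]

STOPWORDS = {"the", "of", "and", "with", "between", "what", "happened", "after"}


def tokens(text):
    """Lowercased content words of a text, with stopwords removed."""
    return {w.lower().strip(".,?") for w in text.split()} - STOPWORDS


def retrieve(question, corpus, exclude, k=2):
    """Stand-in retriever: rank unseen documents by word overlap with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(d.text)), d) for d in corpus if d not in exclude]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1].date))
    return [d for _, d in scored[:k]]


def ask_questions(topic, frontier):
    """Stand-in for the LLM step: one follow-up question per newly added event."""
    if not frontier:
        return [topic]  # first round: query the topic itself
    return [f"What happened after {d.text}?" for d in frontier]


def build_timeline(topic, corpus, max_rounds=5):
    """Alternate question generation and retrieval until no new documents appear."""
    seen = []      # all documents gathered so far
    frontier = []  # documents added in the previous round
    for _ in range(max_rounds):
        new_docs = []
        for question in ask_questions(topic, frontier):
            new_docs.extend(retrieve(question, corpus, seen + new_docs))
        if not new_docs:
            break  # a round that finds nothing new ends the loop
        seen.extend(new_docs)
        frontier = new_docs
    return sorted(seen, key=lambda d: d.date)  # refresh the chronological order


if __name__ == "__main__":
    for event in build_timeline("merger of X and Y", CORPUS):
        print(event.date, event.text)
```

In this toy setup the third event shares no words with the topic query and is only reached through a follow-up question about the second event, which is the kind of gap the self-questioning rounds are meant to close. A real system would presumably swap `ask_questions` for an LLM call and `retrieve` for a web or corpus search.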
Problem

Research questions and friction points this paper is trying to address.

Information Retrieval
Timeline Generation
Event Sequencing
Innovation

Methods, ideas, or system contributions that make the work stand out.

CHRONOS
Open-TLS
Temporal Narrative Construction
Weiqi Wu
Shanghai Jiao Tong University
Natural Language Processing

Shen Huang
Director of Search, Yihaodian.com
Machine learning, data mining, search, recommendation, personalization

Yong Jiang
Tongyi Lab, Alibaba Group

Pengjun Xie
Alibaba Group
NLP/IR/ML

Fei Huang
Tongyi Lab, Alibaba Group

Hai Zhao
Department of Computer Science and Engineering, Shanghai Jiao Tong University; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University; Shanghai Key Laboratory of Trusted Data Circulation and Governance in Web3