🤖 AI Summary
This work addresses the dynamic reasoning challenges in temporal knowledge graph question answering (TKGQA), particularly those arising from multi-hop dependencies and complex temporal constraints. To this end, the authors propose Temp-R1, which they describe as the first end-to-end autonomous agent for TKGQA. Built upon an 8B-parameter large language model, Temp-R1 is trained with a reverse-curriculum reinforcement learning mechanism, which presents hard questions before easy ones to suppress shortcut learning, and uses an expanded action space that integrates internal reasoning actions with external tool invocation. Evaluated on the MultiTQ and TimelineKGQA benchmarks, Temp-R1 achieves state-of-the-art performance, improving accuracy on complex questions by 19.8% over strong baselines.
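The hard-to-easy ordering can be illustrated with a minimal sketch. It assumes each question already carries a numeric difficulty score; the `Question` class, the `difficulty` field, and the `reverse_curriculum` helper are hypothetical names chosen for illustration, and the summary does not state how Temp-R1 measures difficulty or how many stages it uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Question:
    text: str
    difficulty: int  # hypothetical score; higher means harder


def reverse_curriculum(questions: List[Question], num_stages: int) -> List[List[Question]]:
    """Split questions into training stages ordered from hardest to easiest."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    if not questions:
        return []
    # Hardest questions come first, the reverse of a standard curriculum.
    ordered = sorted(questions, key=lambda q: q.difficulty, reverse=True)
    # Ceiling division so that every question lands in some stage.
    stage_size = -(-len(ordered) // num_stages)
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]
```

A training loop would then iterate over the returned stages in order, so the policy sees multi-hop or heavily constrained questions before single-fact ones.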
📝 Abstract
Temporal Knowledge Graph Question Answering (TKGQA) is inherently challenging, as it requires sophisticated reasoning over dynamic facts with multi-hop dependencies and complex temporal constraints. Existing methods rely on fixed workflows and expensive closed-source APIs, limiting flexibility and scalability. We propose Temp-R1, the first autonomous end-to-end agent for TKGQA trained through reinforcement learning. To address cognitive overload in single-action reasoning, we expand the action space with specialized internal actions alongside an external action. To prevent shortcut learning on simple questions, we introduce reverse curriculum learning that trains on difficult questions first, forcing the development of sophisticated reasoning before transferring to easier cases. Our 8B-parameter Temp-R1 achieves state-of-the-art performance on MultiTQ and TimelineKGQA, improving by 19.8% over strong baselines on complex questions. Our work establishes a new paradigm for autonomous temporal reasoning agents. Our code will be publicly available soon at https://github.com/zjukg/Temp-R1.
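As a rough illustration of an action space that mixes internal and external actions, the sketch below parses tagged model output into a sequence of actions and flags which ones would require a tool call. The action names (`plan`, `filter`, `rank`, `search`) and the XML-like tag format are assumptions made for this example; the abstract does not list the actual actions or the output format used by Temp-R1.

```python
import re
from enum import Enum
from typing import List, Tuple


class Action(Enum):
    # Hypothetical internal actions: handled by the model's own reasoning.
    PLAN = "plan"
    FILTER = "filter"
    RANK = "rank"
    # Hypothetical external action: would trigger a knowledge graph retrieval.
    SEARCH = "search"


EXTERNAL_ACTIONS = {Action.SEARCH}

# Matches <tag>content</tag> for the assumed tag names above.
_TAG_PATTERN = re.compile(r"<(plan|filter|rank|search)>(.*?)</\1>", re.DOTALL)


def parse_actions(model_output: str) -> List[Tuple[Action, str]]:
    """Extract (action, content) pairs in the order they appear."""
    return [
        (Action(match.group(1)), match.group(2).strip())
        for match in _TAG_PATTERN.finditer(model_output)
    ]


def needs_tool_call(action: Action) -> bool:
    """Return True when the action must be executed outside the model."""
    return action in EXTERNAL_ACTIONS
```

Separating internal steps from the single external step in this way lets an agent loop dispatch only `search` actions to the retriever while treating the rest as reasoning text.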