TKG-Thinker: Towards Dynamic Reasoning over Temporal Knowledge Graphs via Agentic Reinforcement Learning

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of temporal reasoning hallucinations in large language models (LLMs) when applied to temporal knowledge graph question answering, as well as the limited autonomy and generalization caused by static prompting strategies. To overcome these limitations, the authors propose TKG-Thinker, an agent framework that enables deep temporal reasoning through multi-turn adaptive retrieval and interactive engagement with the environment. The approach integrates chain-of-thought supervised fine-tuning with multidimensional reward-based reinforcement learning to construct a dynamic reasoning architecture endowed with autonomous planning capabilities. Extensive experiments across multiple benchmark datasets, leveraging three open-source LLMs, demonstrate state-of-the-art performance, significantly enhancing reasoning reliability and generalization under complex temporal constraints.

📝 Abstract
Temporal knowledge graph question answering (TKGQA) aims to answer time-sensitive questions by leveraging temporal knowledge bases. While Large Language Models (LLMs) demonstrate significant potential in TKGQA, current prompting strategies constrain their efficacy in two primary ways. First, they are prone to reasoning hallucinations under complex temporal constraints. Second, static prompting limits model autonomy and generalization, as it lacks optimization through dynamic interaction with temporal knowledge graph (TKG) environments. To address these limitations, we propose TKG-Thinker, a novel agent equipped with autonomous planning and adaptive retrieval capabilities for reasoning over TKGs. Specifically, TKG-Thinker performs in-depth temporal reasoning through dynamic multi-turn interactions with TKGs, trained via a dual-stage strategy: we first apply Supervised Fine-Tuning (SFT) on chain-of-thought data to instill core planning capabilities, followed by a Reinforcement Learning (RL) stage that leverages multi-dimensional rewards to refine reasoning policies under intricate temporal constraints. Experimental results on benchmark datasets with three open-source LLMs show that TKG-Thinker achieves state-of-the-art performance and exhibits strong generalization across complex TKGQA settings.
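The abstract's RL stage scores rollouts with "multi-dimensional rewards," but the exact components are not specified here. A minimal sketch of how such a reward might be composed, assuming illustrative dimensions (answer correctness, a well-formed reasoning/action trace, and a retrieval-cost penalty) and hypothetical weights:

```python
def multi_dim_reward(answer: str, gold: str, trace_valid: bool,
                     num_retrievals: int,
                     w_answer: float = 1.0, w_format: float = 0.2,
                     w_cost: float = 0.05) -> float:
    """Hypothetical multi-dimensional reward for an agentic TKGQA rollout.

    The components and weights are illustrative assumptions, not the
    paper's actual reward design.
    """
    r_answer = 1.0 if answer == gold else 0.0   # exact-match correctness
    r_format = 1.0 if trace_valid else 0.0      # well-formed reasoning/action trace
    r_cost = -float(num_retrievals)             # penalize excessive TKG queries
    return w_answer * r_answer + w_format * r_format + w_cost * r_cost
```

In an actual training loop, each multi-turn interaction with the TKG would be rolled out, scored by such a function, and the score used as the policy-gradient signal for the RL stage.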
Problem

Research questions and friction points this paper is trying to address.

Temporal Knowledge Graph Question Answering
Reasoning Hallucination
Static Prompting
Model Autonomy
Temporal Constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal Knowledge Graph
Agent-based Reasoning
Reinforcement Learning
Dynamic Interaction
Chain-of-Thought