Short-Term-to-Long-Term Memory Transfer for Knowledge Graphs under Partial Observability

📅 2026-05-21

🤖 AI Summary
This work addresses a gap in reinforcement learning under partial observability: most memory-based methods do not explicitly model how symbolic observations move from short-term to long-term memory. It casts this transfer as a neuro-symbolic value-based decision problem in which a per-item Q-learning mechanism decides, for each observed triple, whether to keep or drop it before insertion into a capacity-constrained long-term memory. Shared parameters and a temporal-difference update over items matched across consecutive steps handle variable-sized short-term buffers. On the RoomKG benchmark with a long-term memory capacity of 128, the learned transfer decisions outperform symbolic and neural baselines, and a lightweight local short-term-only variant performs best among the transfer-policy ablations. The policy keeps navigation- and query-relevant facts and discards lower-value ones, making memory decisions explicit and interpretable.
📝 Abstract
Reinforcement learning under partial observability requires deciding what information to retain, yet most memory-based approaches do not explicitly model short-term-to-long-term transfer of symbolic observations. We study this transfer process in a temporal knowledge-graph memory setting and cast it as a neuro-symbolic value-based decision problem: for each observed triple, the agent chooses whether to keep or drop it before long-term insertion. To handle variable-sized short-term buffers, we use a per-item Q-learning design with shared parameters and a practical temporal-difference update over matched items across consecutive steps. On the RoomKG benchmark at long-term memory capacity 128, learned transfer decisions outperform symbolic and neural baselines, including symbolic baselines with temporal annotations and history-based LSTM/Transformer baselines. Across transfer-policy ablations, a lightweight local short-term-only variant performs best, and step-level behavior shows that the policy keeps navigation- and query-relevant facts while discarding lower-value candidate facts, supporting explicit and interpretable memory decisions under memory constraints.
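The keep-or-drop decision described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. The linear Q-function, the hand-made features in `featurize`, matching items by exact triple identity, and the oldest-first eviction in `transfer` are all assumptions made for illustration, and every name below is hypothetical.

```python
import random
from collections import deque

DROP, KEEP = 0, 1

def featurize(triple):
    """Hypothetical features for a (head, relation, tail) triple."""
    _head, relation, _tail = triple
    return [
        1.0,                                                             # bias
        1.0 if relation in ("north", "south", "east", "west") else 0.0,  # map edge
        1.0 if relation == "located_in" else 0.0,                        # object location
    ]

class PerItemQ:
    """One linear Q-function whose weights are shared by every buffer item."""

    def __init__(self, n_features, lr=0.1, gamma=0.9):
        # One weight row per action (DROP, KEEP); assumed linear form.
        self.w = [[0.0] * n_features for _ in range(2)]
        self.lr, self.gamma = lr, gamma

    def q(self, triple, action):
        return sum(w * x for w, x in zip(self.w[action], featurize(triple)))

    def act(self, buffer, epsilon=0.0, rng=random):
        """Choose keep/drop independently per triple; buffer size may vary."""
        decisions = {}
        for triple in buffer:
            if rng.random() < epsilon:
                decisions[triple] = rng.choice((DROP, KEEP))
            else:
                keep_better = self.q(triple, KEEP) > self.q(triple, DROP)
                decisions[triple] = KEEP if keep_better else DROP
        return decisions

    def td_update(self, decisions, reward, next_buffer):
        """TD(0) update restricted to triples that reappear in the next buffer.

        Matching by identical triple is an assumption of this sketch.
        Returns the number of matched items that were updated.
        """
        matched = [t for t in decisions if t in next_buffer]
        for triple in matched:
            action = decisions[triple]
            best_next = max(self.q(triple, DROP), self.q(triple, KEEP))
            error = reward + self.gamma * best_next - self.q(triple, action)
            for i, x in enumerate(featurize(triple)):
                self.w[action][i] += self.lr * error * x
        return len(matched)

def transfer(decisions, long_term):
    """Insert kept triples; a bounded deque evicts the oldest entry when full
    (the eviction rule is an assumption, not taken from the paper)."""
    for triple, action in decisions.items():
        if action == KEEP and triple not in long_term:
            long_term.append(triple)

# Usage: the same agent handles buffers of different sizes.
agent = PerItemQ(n_features=3)
long_term = deque(maxlen=128)  # capacity matching the benchmark setting
step_t = [("kitchen", "north", "hall"), ("phone", "located_in", "kitchen")]
step_t1 = [("kitchen", "north", "hall")]
decisions = agent.act(step_t, epsilon=0.0)
agent.td_update(decisions, reward=1.0, next_buffer=step_t1)
transfer(decisions, long_term)
```

Because the weights are shared, one update on a matched triple also shifts the value estimates of other triples with overlapping features, which is what lets a single small model score a buffer of any length.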
Problem

Research questions and friction points this paper is trying to address.

partial observability
memory transfer
knowledge graphs
reinforcement learning
short-term-to-long-term memory
Innovation

Methods, ideas, or system contributions that make the work stand out.

neuro-symbolic reinforcement learning
memory transfer
partial observability
knowledge graph memory
per-item Q-learning