🤖 AI Summary
Efficient and interpretable decision-making under temporal uncertainty in partially observable Markov decision processes (POMDPs) remains challenging.
Method: We propose an Event Calculus-driven linear temporal logic (LTL) framework that automatically learns temporally persistent symbolic macro-actions from a few belief-action trajectories, integrating Event Calculus, LTL, and inductive logic programming (ILP) without handcrafted heuristics. The learned macro-actions are embedded into Monte Carlo tree search (MCTS) to reduce inference time.
Results: Evaluated on the Pocman and Rocksample benchmarks, our approach achieves greater expressivity and temporal adaptability than static heuristics while maintaining robustness. It enables end-to-end learning using only the POMDP transition model, requiring no reward or observation models, and simultaneously supports interpretability, generalization, and computational efficiency.
📝 Abstract
This paper proposes an integration of temporal logical reasoning and Partially Observable Markov Decision Processes (POMDPs) to achieve interpretable decision-making under uncertainty with macro-actions. Our method leverages a fragment of Linear Temporal Logic (LTL) based on Event Calculus (EC) to generate persistent (i.e., constant) macro-actions, which guide Monte Carlo Tree Search (MCTS)-based POMDP solvers over a time horizon, significantly reducing inference time while ensuring robust performance. Such macro-actions are learned via Inductive Logic Programming (ILP) from a few execution traces (belief-action pairs), thus eliminating the need for manually designed heuristics and requiring only the specification of the POMDP transition model. In the Pocman and Rocksample benchmark scenarios, our learned macro-actions demonstrate increased expressiveness and generality compared to time-independent heuristics, while also offering substantial computational efficiency improvements.
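To make the notion of a persistent macro-action concrete, the following is a minimal Python sketch of Event Calculus-style inertia over a sequence of beliefs: an action is initiated when a belief condition holds and stays selected until a terminating condition holds. It is not the paper's implementation. The names `MacroRule`, `select_actions`, the belief feature `rock_good`, the action labels, and the thresholds are all hypothetical, chosen only for illustration; in the paper such rules are learned via ILP and used to guide an MCTS-based solver, neither of which is modeled here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical belief representation: feature name -> probability.
Belief = Dict[str, float]


@dataclass
class MacroRule:
    """Hypothetical Event Calculus-style rule for a single action."""
    action: str
    initiates: Callable[[Belief], bool]   # belief condition that starts the macro-action
    terminates: Callable[[Belief], bool]  # belief condition that ends it


def select_actions(rules: List[MacroRule], beliefs: List[Belief], default: str) -> List[str]:
    """Apply inertia: once initiated, an action holds until it is terminated."""
    active: Optional[MacroRule] = None
    chosen: List[str] = []
    for belief in beliefs:
        # Check termination first so a finished macro-action releases control.
        if active is not None and active.terminates(belief):
            active = None
        # With no active macro-action, pick the first rule whose condition holds.
        if active is None:
            active = next((r for r in rules if r.initiates(belief)), None)
        chosen.append(active.action if active is not None else default)
    return chosen


# Illustrative rule (thresholds are assumptions, not values from the paper).
rule = MacroRule(
    action="move_to_rock",
    initiates=lambda b: b["rock_good"] >= 0.7,
    terminates=lambda b: b["rock_good"] < 0.3,
)
beliefs = [{"rock_good": p} for p in (0.5, 0.8, 0.6, 0.4, 0.2, 0.5)]
plan = select_actions([rule], beliefs, default="check")
print(plan)
```

In this toy run the action persists at beliefs 0.6 and 0.4, where a time-independent threshold rule (`rock_good >= 0.7`) would have fallen back to the default. This is the kind of temporal behavior the abstract contrasts with time-independent heuristics.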