Learning Symbolic Persistent Macro-Actions for POMDP Solving Over Time

📅 2025-05-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Efficient and interpretable decision-making under temporal uncertainty in partially observable Markov decision processes (POMDPs) remains challenging. Method: We propose an event calculus-driven linear temporal logic (LTL) framework that automatically learns temporally persistent symbolic macro-actions from a few belief-action trajectories, integrating event calculus, LTL, and inductive logic programming (ILP) without handcrafted heuristics. The learned macro-actions are embedded into Monte Carlo tree search (MCTS) to drastically reduce inference overhead. Results: Evaluated on the Pocman and Rocksample benchmarks, our approach achieves superior expressivity and temporal adaptability compared to static heuristics, while maintaining robustness. It enables end-to-end learning using only the POMDP transition model—requiring no reward or observation models—and simultaneously ensures interpretability, generalization, and computational efficiency.

📝 Abstract
This paper proposes an integration of temporal logical reasoning and Partially Observable Markov Decision Processes (POMDPs) to achieve interpretable decision-making under uncertainty with macro-actions. Our method leverages a fragment of Linear Temporal Logic (LTL) based on Event Calculus (EC) to generate *persistent* (i.e., constant) macro-actions, which guide Monte Carlo Tree Search (MCTS)-based POMDP solvers over a time horizon, significantly reducing inference time while ensuring robust performance. Such macro-actions are learnt via Inductive Logic Programming (ILP) from a few traces of execution (belief-action pairs), thus eliminating the need for manually designed heuristics and requiring only the specification of the POMDP transition model. In the Pocman and Rocksample benchmark scenarios, our learned macro-actions demonstrate increased expressiveness and generality compared to time-independent heuristics, while offering substantial computational efficiency improvements.
Problem

Research questions and friction points this paper is trying to address.

Integrating temporal logic with POMDPs for interpretable decision-making under uncertainty
Learning persistent macro-actions via ILP to reduce POMDP inference time
Improving computational efficiency in benchmark scenarios using expressive macro-actions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates LTL and POMDPs for interpretable decisions
Uses ILP to learn macro-actions from traces
Employs MCTS with persistent macro-actions for efficiency
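The interplay of persistent macro-actions with an MCTS rollout policy can be illustrated with a minimal sketch. All names here (`PersistentMacro`, `rollout_policy`, the belief representation) are hypothetical and chosen for illustration only; in the paper such symbolic conditions are learnt via ILP from execution traces rather than hard-coded:

```python
import random

class PersistentMacro:
    """A macro-action that commits to one primitive action for up to
    max_len steps once a symbolic condition on the belief fires."""
    def __init__(self, action, condition, max_len):
        self.action = action          # primitive action to repeat
        self.condition = condition    # belief -> bool (learnt symbolically)
        self.max_len = max_len        # persistence horizon

def rollout_policy(belief, macros, primitives, rng):
    """Return the action sequence a triggered macro commits to, or a
    single random primitive if no macro's condition holds."""
    for m in macros:
        if m.condition(belief):
            # Persistence: the same action is held for max_len steps,
            # shrinking the effective branching factor of MCTS rollouts.
            return [m.action] * m.max_len
    return [rng.choice(primitives)]

# Toy belief for a Rocksample-like scenario: probability a rock is valuable.
macros = [PersistentMacro("sample", lambda b: b["p_valuable"] > 0.8, max_len=3)]
rng = random.Random(0)
print(rollout_policy({"p_valuable": 0.9}, macros, ["north", "south"], rng))
```

The design point is that committing to a fixed action over a horizon prunes the rollout search space, which is where the reported inference-time savings come from.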