AI Summary
This work addresses learning lifted STRIPS+ action models from action traces that carry only partial information: selected action arguments are fully observed, while the state is either not observed at all or observed only through some of its predicates. The paper formulates algorithms and completeness results for three observability settings: no observability of the state, full observability of some state predicates, and local observability of some state predicates. For a given STRIPS+ domain, these results characterize the conditions under which an equivalent domain can be learned from traces. Experimental results are reported.
Abstract
It has been recently shown that lifted STRIPS models can be learned correctly and efficiently from action traces alone; i.e., applicable action sequences from a hidden STRIPS model. The result is remarkable because the states are not assumed to be observable at all, and yet it is not practical enough as STRIPS actions include arguments that are not needed for selecting the actions. This shortcoming has been addressed by assuming that the action traces come instead from a hidden STRIPS+ model where some action arguments are implicit in the hidden action preconditions. A limitation of this approach, however, is that it assumes that the states are fully observable. In this work, we relax these restrictions and consider the problem of learning STRIPS+ action domains from traces in a more general context where the traces carry partial information about both actions and states. In particular, we formulate algorithms and completeness results for three general cases, all of which assume full observability of selected action arguments. In the first case, no observability of the state is assumed; in the second case, full observability of some state predicates is assumed, and in the third case, local observability of some state predicates is assumed instead. Given a STRIPS+ domain, these results characterize the conditions under which an equivalent domain can be learned from traces. Experimental results are reported.
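To make the setting concrete, the following is a minimal illustrative sketch, not the paper's algorithm or notation. It assumes a Blocksworld-style domain in which a hypothetical STRIPS+ action unstack(x) leaves the supporting block y implicit, so that y is fixed by the hidden precondition on(x, y). All names here (`implicit_below`, `apply_unstack`, `observe`) are assumptions introduced for illustration. The sketch shows what a learner would see in the first two cases: a trace of actions with their selected arguments only, and a trace that also exposes the atoms of some observed predicates. The third case is omitted because the abstract does not define local observability.

```python
from typing import FrozenSet, Optional, Set, Tuple

Atom = Tuple[str, ...]      # e.g. ("on", "a", "b")
State = FrozenSet[Atom]     # a ground state as a set of true atoms


def implicit_below(state: State, x: str) -> Optional[str]:
    """Resolve the implicit argument y of unstack(x) via the precondition on(x, y).

    Hypothetical helper: returns y only if exactly one such atom holds.
    """
    matches = [atom[2] for atom in state if atom[0] == "on" and atom[1] == x]
    return matches[0] if len(matches) == 1 else None


def apply_unstack(state: State, x: str) -> Optional[State]:
    """Apply the assumed STRIPS+ action unstack(x); return None if inapplicable."""
    y = implicit_below(state, x)
    if y is None or ("clear", x) not in state or ("handempty",) not in state:
        return None
    delete: Set[Atom] = {("on", x, y), ("clear", x), ("handempty",)}
    add: Set[Atom] = {("holding", x), ("clear", y)}
    return frozenset((state - delete) | add)


def observe(state: State, observed_predicates: Set[str]) -> State:
    """Keep only the atoms whose predicate is assumed to be fully observable."""
    return frozenset(atom for atom in state if atom[0] in observed_predicates)


# Hidden initial state: block a sits on block b, which is on the table.
s0: State = frozenset(
    {("on", "a", "b"), ("ontable", "b"), ("clear", "a"), ("handempty",)}
)
s1 = apply_unstack(s0, "a")

# Case 1: only the action and its selected argument are visible.
trace_actions_only = [("unstack", "a")]

# Case 2: additionally, every atom of the predicate "clear" is visible.
trace_with_states = [
    observe(s0, {"clear"}),
    ("unstack", "a"),
    observe(s1, {"clear"}),
]
```

In this sketch the learner never sees the argument b in the action itself; in the second trace it surfaces only indirectly, through the change in the observed `clear` atoms.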