🤖 AI Summary
This work addresses the challenge of enabling agents to follow multi-task temporal instructions specified in Linear Temporal Logic (LTL) within sub-symbolic reinforcement learning environments, without prior knowledge of the mapping between observations and symbolic propositions. To this end, the authors propose a joint training framework that simultaneously learns a multi-task policy and a symbol grounding module, relying solely on raw observations and sparse rewards. Symbol grounding is achieved via a neural reward machine in a semi-supervised manner. This approach represents the first successful execution of LTL tasks in sub-symbolic settings without access to ground-truth symbol mappings, thereby overcoming the traditional reliance on explicit symbolic knowledge. Experimental results demonstrate that, in vision-based environments, the method matches the performance of approaches using oracle symbol grounding and significantly outperforms existing sub-symbolic reinforcement learning methods.
📝 Abstract
In this work, we address the problem of training a Reinforcement Learning agent to follow multiple temporally extended instructions expressed in Linear Temporal Logic (LTL) in sub-symbolic environments. Previous multi-task work has mostly relied on knowledge of the mapping between raw observations and the symbols appearing in the formulae. We drop this unrealistic assumption by jointly training a multi-task policy and a symbol grounder on the same experience. The symbol grounder is trained only from raw observations and sparse rewards via Neural Reward Machines, in a semi-supervised fashion. Experiments on vision-based environments show that our method achieves performance comparable to that obtained with the ground-truth symbol grounding and significantly outperforms state-of-the-art methods for sub-symbolic environments.
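To make the setup concrete, the following is a minimal sketch (not the authors' implementation) of the two pieces the abstract describes: a reward-machine-style automaton that tracks progress on an LTL task such as "eventually reach A, then eventually reach B" (F(a ∧ F b)), and a grounder that maps raw observations to propositions. Here the transition table and the `oracle_grounder` are hypothetical and hard-coded; in the paper the grounding is learned jointly with the policy from observations and sparse rewards.

```python
# Reward-machine states: 0 = waiting for "a", 1 = waiting for "b", 2 = done.
# Transition table (hypothetical example): (state, proposition) -> next state.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
}

def step_machine(state, true_props):
    """Advance the reward machine on the set of currently-true propositions.

    Returns (next_state, reward); the reward is sparse: 1.0 only when the
    machine reaches its accepting state (task completed), else 0.0.
    """
    for p in true_props:
        nxt = TRANSITIONS.get((state, p))
        if nxt is not None:
            return nxt, (1.0 if nxt == 2 else 0.0)
    return state, 0.0

def oracle_grounder(obs):
    """Stand-in for the symbol grounder: maps a raw observation to the set
    of propositions it satisfies. Hard-coded here for illustration; in the
    paper this mapping is learned, not given."""
    props = set()
    if obs.get("at_a"):
        props.add("a")
    if obs.get("at_b"):
        props.add("b")
    return props

# Roll out a short trajectory that reaches A and then B.
state, total = 0, 0.0
for obs in [{"at_a": False}, {"at_a": True}, {"at_b": True}]:
    state, r = step_machine(state, oracle_grounder(obs))
    total += r
print(state, total)  # 2 1.0
```

The key point the paper addresses is that `oracle_grounder` is normally assumed known; replacing it with a module trained only from raw observations and the sparse reward signal is what the joint training framework provides.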