🤖 AI Summary
In video game agents, state representations often fail to capture action-relevant causal factors, particularly in continuous action spaces. Method: This paper proposes a supervised contrastive imitation learning framework tailored for continuous actions. Its core innovation is integrating the Supervised Contrastive (SupCon) loss into imitation learning to jointly optimize state representation and action prediction, thereby eliminating reliance on discrete action assumptions. The method explicitly models causal relationships between observations and expert actions, enhancing representation discriminability and cross-environment generalization. Results: Experiments on Astro Bot, Returnal, and multiple Atari games demonstrate significant improvements in representation quality over baselines, accelerated training convergence, and superior transfer performance on unseen tasks.
📝 Abstract
This paper introduces a novel application of Supervised Contrastive Learning (SupCon) to Imitation Learning (IL), with a focus on learning more effective state representations for agents in video game environments. The goal is to obtain latent representations of the observations that better capture the action-relevant factors, thereby better modeling the cause-effect relationship between the observations and the actions performed by the demonstrator; for example, the player jumps whenever an obstacle appears ahead. We propose an approach that integrates the SupCon loss with continuous output spaces, enabling SupCon to operate without constraints on the environment's action type. Experiments on the 3D games Astro Bot and Returnal, and on multiple 2D Atari games, show improved representation quality, faster learning convergence, and better generalization compared to baseline models trained only with supervised action prediction loss functions.
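To make the idea concrete, here is a minimal sketch of how a SupCon-style loss can be adapted to continuous actions. The paper's exact formulation is not given here; this sketch assumes one plausible scheme in which two states count as "positives" when their expert actions lie within a distance threshold `eps` of each other (replacing the discrete class labels SupCon normally requires). The function name `supcon_continuous` and the thresholding rule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def supcon_continuous(z, actions, tau=0.1, eps=0.1):
    """SupCon-style contrastive loss for continuous action labels.

    Hypothetical adaptation: instead of discrete classes, positives are
    pairs of states whose expert actions are within `eps` of each other.
    z: (N, D) state embeddings; actions: (N, A) continuous expert actions.
    """
    z = F.normalize(z, dim=1)               # unit-norm embeddings
    sim = (z @ z.T) / tau                   # scaled cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)

    # Positives: samples with nearby continuous actions, excluding self.
    pos = (torch.cdist(actions, actions) < eps) & ~self_mask

    # Softmax denominator over all non-self pairs.
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)  # avoid -inf * 0 = nan

    # Average log-probability over each anchor's positive set.
    pos_count = pos.sum(dim=1).clamp(min=1)
    loss = -(log_prob * pos).sum(dim=1) / pos_count

    # Only anchors that actually have positives contribute.
    has_pos = pos.any(dim=1)
    return loss[has_pos].mean() if has_pos.any() else sim.new_zeros(())
```

In a joint training setup as described in the summary, this term would be added to the behavior-cloning regression loss, e.g. `total = mse(pred_action, expert_action) + lam * supcon_continuous(z, expert_action)`, so that representation shaping and action prediction are optimized together.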