🤖 AI Summary
This work proposes TART, a temporal action representation learning framework for autonomous robots that must make tactical decisions under resource constraints. It targets two gaps in prior hybrid-action methods: they do not adequately model the causal dependencies between resource usage and subsequent continuous maneuvers, and they overlook the multi-modal nature of tactical decisions. TART uses contrastive learning based on a mutual information objective to capture the temporal dependencies in resource-maneuver interactions. The learned representations are then quantized into discrete codebook entries that condition the policy over a hybrid action space of discrete resource usage and continuous maneuvers, so that recurring tactical patterns are represented explicitly. This design enables context-aware, temporally coherent, and multi-modal behaviors. In experiments on a maze navigation task and a high-fidelity F-16 air combat simulator, TART consistently outperforms hybrid-action baselines, making more effective use of limited resources and producing context-aware subsequent maneuvers.
📝 Abstract
Autonomous robotic systems should reason about resource control and its impact on subsequent maneuvers, especially when operating with limited energy budgets or restricted sensing. Learning-based control handles complex dynamics effectively and can represent the problem with a hybrid action space that unifies discrete resource usage and continuous maneuvers. However, prior work on hybrid action spaces has not sufficiently captured the causal dependencies between resource usage and maneuvers, and it has overlooked the multi-modal nature of tactical decisions. Both aspects are critical in fast-evolving scenarios. In this paper, we propose TART, a Temporal Action Representation learning framework for Tactical resource control and subsequent maneuver generation. TART leverages contrastive learning based on a mutual information objective, designed to capture inherent temporal dependencies in resource-maneuver interactions. These learned representations are quantized into discrete codebook entries that condition the policy, capturing recurring tactical patterns and enabling multi-modal and temporally coherent behaviors. We evaluate TART in two domains where resource deployment is critical: (i) a maze navigation task where a limited budget of discrete actions provides enhanced mobility, and (ii) a high-fidelity air combat simulator in which an F-16 agent operates weapons and defensive systems in coordination with flight maneuvers. Across both domains, TART consistently outperforms hybrid-action baselines, demonstrating its effectiveness in leveraging limited resources and producing context-aware subsequent maneuvers.
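The two mechanisms named in the abstract, a contrastive objective tied to mutual information and quantization into a codebook, can be illustrated with a small sketch. The abstract does not specify the loss or the quantizer, so everything below is an assumption chosen for illustration: InfoNCE stands in for the contrastive objective (it yields a lower bound on mutual information), nearest-neighbor lookup stands in for the quantization step, and the names `info_nce` and `quantize` are hypothetical rather than taken from the paper.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss (assumed stand-in for the paper's contrastive objective).

    anchors[i] could be an embedding of a resource action and positives[i]
    an embedding of the maneuver that followed it. Every positives[j] with
    j != i serves as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [dot(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the true pair.
        total += log_denominator - logits[i]
    return total / len(anchors)


def quantize(z, codebook):
    """Return (index, entry) of the codebook vector nearest to z.

    Uses squared Euclidean distance; the selected index is the discrete
    code that would condition the policy in this sketch.
    """
    distances = [sum((a - b) ** 2 for a, b in zip(z, entry)) for entry in codebook]
    index = min(range(len(codebook)), key=distances.__getitem__)
    return index, codebook[index]


# Toy data: two resource-action embeddings and their follow-up maneuvers.
resource_embeddings = [[1.0, 0.0], [0.0, 1.0]]
matched_maneuvers = [[1.0, 0.0], [0.0, 1.0]]
mismatched_maneuvers = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = info_nce(resource_embeddings, matched_maneuvers)
shuffled_loss = info_nce(resource_embeddings, mismatched_maneuvers)

# Hypothetical three-entry codebook of "tactical patterns".
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
code_index, code_vector = quantize([0.9, 0.1], codebook)
```

In this toy setup the loss is lower when each resource embedding is paired with its true follow-up maneuver than when the pairs are shuffled, which is the signal a contrastive objective exploits. In a trained system the embeddings and codebook entries would be learned jointly; here they are fixed by hand to keep the sketch self-contained.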