🤖 AI Summary
In safe reinforcement learning (RL), manually designing cost functions to encode natural language constraints is labor-intensive and inflexible. Method: We propose the Trajectory-level Textual Constraints Translator (TTCT), the first framework that directly incorporates natural language constraints both as inputs and as training signals, eliminating the need for handcrafted cost functions. TTCT employs an end-to-end neural translation architecture that integrates a text encoder, a trajectory encoder, and contrastive learning, jointly optimized within a safe RL framework, and it supports zero-shot transfer to dynamically changing constraints. Contribution/Results: Experiments across diverse constraint types show an average 37% reduction in constraint violation rates. On unseen constraint categories, TTCT achieves 82% zero-shot generalization accuracy, significantly improving both policy safety and generalizability.
📝 Abstract
Safe reinforcement learning (RL) requires an agent to complete a given task while obeying specific constraints. Expressing constraints in natural language holds great promise for practical scenarios because of its flexible transfer capability and accessibility. Previous safe RL methods with natural language constraints typically require a manually designed cost function for each constraint, which demands domain expertise and lacks flexibility. In this paper, we harness the dual role of text in this task, using it not only to specify constraints but also as a training signal. We introduce the Trajectory-level Textual Constraints Translator (TTCT) to replace the manually designed cost function. Our empirical results demonstrate that TTCT effectively comprehends textual constraints and trajectories, and that policies trained with TTCT achieve lower violation rates than those trained with a standard cost function. Additional studies demonstrate that TTCT has zero-shot transfer capability, adapting to constraint-shifted environments.
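The summary describes aligning a text encoder and a trajectory encoder with contrastive learning. The paper does not specify the exact objective, so the sketch below assumes a CLIP-style symmetric InfoNCE loss over a batch where the i-th text constraint matches the i-th trajectory; the encoders themselves are placeholders and only the loss is shown.

```python
import numpy as np

def info_nce(text_emb, traj_emb, temperature=0.1):
    """Symmetric contrastive (InfoNCE) loss that pulls each text-constraint
    embedding toward its paired trajectory embedding (same row index) and
    pushes it away from the other trajectories in the batch.

    text_emb, traj_emb: (B, D) arrays from hypothetical encoders.
    """
    # L2-normalize so the dot product is cosine similarity.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    r = traj_emb / np.linalg.norm(traj_emb, axis=1, keepdims=True)
    logits = (t @ r.T) / temperature          # (B, B); diagonal = positives
    idx = np.arange(logits.shape[0])

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)   # numerical stability
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()             # diagonal targets

    # Average the text->trajectory and trajectory->text directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Once trained this way, the similarity between a trajectory and a constraint text can serve as a learned cost signal, which is the role the manually designed cost function played in prior work.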