🤖 AI Summary
This work addresses the low evaluation efficiency and high deployment barrier of language models in reward-driven reinforcement learning (RL) environments. To this end, we propose a large language model (LLM)-augmented framework for generating self-completing instructions. We design a lightweight language adapter and an automatic instruction completion mechanism, implemented in the open-source Python library elsciRL, together with a graphical user interface that translates natural-language input into executable RL instructions end to end. The key contribution is the integration of LLMs into the RL closed loop, which enables joint optimization of instructions and actions with minimal configuration and rapid deployment. Experimental results demonstrate substantial improvements in sample efficiency and policy performance across diverse multi-task RL benchmarks. The framework establishes a novel paradigm for language-guided scientific discovery and interpretable RL research.
📝 Abstract
We present elsciRL, an open-source Python library that facilitates the application of language solutions to reinforcement learning problems. We demonstrate the potential of our software by using LLMs to extend the Language Adapter with Self-Completing Instruction framework defined by Osborne (2024). Our approach can be re-applied to new applications with minimal setup requirements. We provide a novel GUI that lets a user supply text input from which an LLM generates instructions, which it can then self-complete. Empirical results indicate that these instructions can improve a reinforcement learning agent's performance. We therefore present this work to accelerate the evaluation of language solutions in reward-based environments and to enable new opportunities for scientific discovery.