🤖 AI Summary
This work proposes a natural language–based control framework to lower the barrier to multi-task drone operation. The system fine-tunes the CodeT5 model on (natural language instruction, executable code) pairs generated by ChatGPT, then automatically translates user commands into executable scripts that drive drones in a high-fidelity AirSim/Unreal Engine simulation environment. This study presents the first integration of large language models with a high-fidelity drone simulation platform, enabling complex task execution through intuitive linguistic input. Experimental results demonstrate that the approach achieves strong instruction comprehension and reliable task execution in simulation, significantly enhancing human–drone interaction efficiency and laying a foundation for real-world natural language control of autonomous aerial systems.
📝 Abstract
Benefiting from rapid advances in large language models (LLMs), human–drone interaction now faces unprecedented opportunities. In this paper, we propose a method that integrates a fine-tuned CodeT5 model with the Unreal Engine-based AirSim drone simulator to efficiently execute multi-task operations from natural language commands. This approach lets users interact with simulated drones through prompts or command descriptions, easily querying and controlling the drone's status and significantly lowering the barrier to operation. In the AirSim simulator, we can flexibly construct visually realistic dynamic environments to simulate drone applications in complex scenarios. By combining a large dataset of (natural language, program code) command-execution pairs generated by ChatGPT with developer-written drone code as training data, we fine-tune CodeT5 to automatically translate natural language into executable code for drone tasks. Experimental results demonstrate that the proposed method exhibits superior task-execution efficiency and command-understanding capability in simulated environments. In the future, we plan to extend the model's functionality in a modular manner, enhancing its adaptability to complex scenarios and advancing the application of drone technologies in real-world environments.
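As an illustration of the training data described above, the sketch below serializes hypothetical (natural language instruction, executable code) pairs into the JSONL layout commonly used for seq2seq fine-tuning of CodeT5-style models. The pair contents and helper names are illustrative assumptions, not the paper's actual dataset; the target strings mimic AirSim's Python client API.

```python
import json

# Hypothetical (instruction, code) pairs; the target code mimics
# AirSim's Python API (NED frame, so negative z means "up"), but the
# specific examples are illustrative only.
PAIRS = [
    {
        "instruction": "Take off and hover at 10 meters.",
        "code": "client.takeoffAsync().join()\n"
                "client.moveToZAsync(-10, velocity=3).join()",
    },
    {
        "instruction": "Fly forward 20 meters, then land.",
        "code": "client.moveByVelocityAsync(5, 0, 0, duration=4).join()\n"
                "client.landAsync().join()",
    },
]

def to_jsonl(pairs):
    """Serialize pairs as JSON Lines, one record per line, in the
    shape typically fed to seq2seq fine-tuning scripts
    (source = instruction, target = code)."""
    return "\n".join(
        json.dumps({"source": p["instruction"], "target": p["code"]})
        for p in pairs
    )

jsonl = to_jsonl(PAIRS)
print(jsonl.splitlines()[0])
```

At inference time, the fine-tuned model would map a new instruction to a target string of the same shape, which the framework then executes against the AirSim client.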