🤖 AI Summary
Existing robotic control methods struggle to generalize across platforms with significantly different morphologies, typically requiring retraining or fine-tuning for each new system. This work proposes RACAS—a multi-agent collaborative architecture leveraging large language models (LLMs) and vision-language models (VLMs)—that enables cross-platform closed-loop control without modifying code, models, or reward functions. Given only natural language descriptions of the robot, its action space, and task instructions, RACAS orchestrates perception, decision-making, and memory through three natural language–interfaced modules: Monitor, Controller, and Memory Curator. The system successfully executes tasks on three heterogeneous platforms—wheeled ground robots, multi-joint manipulators, and underwater robots—demonstrating strong generalization and effectiveness across diverse robotic embodiments.
📝 Abstract
Many robotic platforms expose an API through which external software can command their actuators and read their sensors. However, transitioning from these low-level interfaces to high-level autonomous behaviour requires a complicated pipeline whose components demand distinct areas of expertise. Existing approaches to bridging this gap either require retraining for every new embodiment or have only been validated across structurally similar platforms. We introduce RACAS (Robot-Agnostic Control via Agentic Systems), a cooperative agentic architecture in which three LLM/VLM-based modules (Monitors, a Controller, and a Memory Curator) communicate exclusively through natural language to provide closed-loop robot control. RACAS requires only a natural language description of the robot, a definition of available actions, and a task specification; no source code, model weights, or reward functions need to be modified to move between platforms. We evaluate RACAS on several tasks using a wheeled ground robot, a recently published multi-jointed robotic limb, and an underwater vehicle. RACAS consistently solved all assigned tasks across these radically different platforms, demonstrating the potential of agentic AI to substantially reduce the barrier to prototyping robotic solutions.
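To make the closed loop described above concrete, here is a minimal, illustrative sketch of how three natural-language-interfaced modules could be wired together. The module names (Monitor, Controller, Memory Curator) and the inputs (robot description, action list, task specification) come from the abstract; everything else, including the stubbed `query_llm` function and the toy robot dynamics, is a hypothetical placeholder, not the paper's actual implementation.

```python
# Hypothetical sketch of a RACAS-style closed control loop.
# Module names follow the abstract; all signatures are illustrative.

def query_llm(prompt: str) -> str:
    """Placeholder for an LLM/VLM call; a trivial rule stands in here."""
    if "distance_to_goal: 0" in prompt:
        return "stop"
    return "move_forward"

class Monitor:
    """Turns raw sensor readings into a natural-language state report."""
    def observe(self, sensors: dict) -> str:
        return ", ".join(f"{k}: {v}" for k, v in sorted(sensors.items()))

class MemoryCurator:
    """Keeps a running natural-language log of past states and actions."""
    def __init__(self):
        self.log: list[str] = []
    def record(self, state: str, action: str) -> None:
        self.log.append(f"state=({state}) -> action={action}")
    def summary(self) -> str:
        return "\n".join(self.log[-5:])  # last few steps as context

class Controller:
    """Picks the next action from the robot description, task, and memory."""
    def __init__(self, robot_desc: str, actions: list[str], task: str):
        self.robot_desc, self.actions, self.task = robot_desc, actions, task
    def decide(self, state: str, memory: str) -> str:
        prompt = (f"Robot: {self.robot_desc}\nActions: {self.actions}\n"
                  f"Task: {self.task}\nMemory:\n{memory}\n"
                  f"State: {state}\nNext action:")
        return query_llm(prompt)

def run_episode(max_steps: int = 10) -> list[str]:
    monitor, curator = Monitor(), MemoryCurator()
    controller = Controller("wheeled ground robot",
                            ["move_forward", "stop"], "reach the goal")
    distance = 3  # toy world: the goal is 3 steps ahead
    taken = []
    for _ in range(max_steps):
        state = monitor.observe({"distance_to_goal": distance})
        action = controller.decide(state, curator.summary())
        curator.record(state, action)
        taken.append(action)
        if action == "stop":
            break
        distance -= 1  # toy dynamics for move_forward
    return taken

print(run_episode())  # → ['move_forward', 'move_forward', 'move_forward', 'stop']
```

Because every inter-module message is plain text, swapping in a different robot only changes the description strings and the action list, which is the platform-agnosticism the abstract claims.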