🤖 AI Summary
This work addresses a key challenge in multi-robot systems: large language models (LLMs) struggle to model task dependencies, which leads to failures in collaborative execution. We propose a dependency-aware task decomposition and execution framework that embeds explicit directed acyclic graph (DAG)-based dependency modeling into an LLM-driven end-to-end pipeline, enabling closed-loop coordination from natural language instructions to subtask generation, robot assignment, and physical execution. The system integrates a question-answering LLM, a task decomposition module, an execution-driven module, and a vision-language object detector. Evaluated on three multi-robot tasks of increasing complexity, the method outperforms all baselines: DeepSeek-R1-671B achieves the highest success rate, while Llama-3.1-8B exhibits the most stable response latency. Ablation studies confirm that DAG modeling significantly improves collaborative robustness and success rates, particularly for smaller LLMs deployed on resource-constrained platforms.
📝 Abstract
Large Language Models (LLMs) have demonstrated promising reasoning capabilities in robotics; however, their application in multi-robot systems remains limited, particularly in handling task dependencies. This paper introduces DART-LLM, a novel framework that employs Directed Acyclic Graphs (DAGs) to model task dependencies, enabling the decomposition of natural language instructions into well-coordinated subtasks for multi-robot execution. DART-LLM comprises four key components: a Question-Answering (QA) LLM module for dependency-aware task decomposition, a Breakdown Function module for robot assignment, an Actuation module for execution, and a Vision-Language Model (VLM)-based object detector for environmental perception, which together achieve end-to-end task execution. Experimental results across three task complexity levels demonstrate that DART-LLM achieves state-of-the-art performance, significantly outperforming the baseline across all evaluation metrics. Among the tested models, DeepSeek-R1-671B achieves the highest success rate, whereas Llama-3.1-8B exhibits superior response-time reliability. Ablation studies further confirm that explicit dependency modeling notably enhances the performance of smaller models, facilitating efficient deployment on resource-constrained platforms. Please refer to the project website https://wyd0817.github.io/project-dart-llm/ for videos and code.
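The core idea of DAG-based dependency modeling can be illustrated with a minimal sketch: subtasks form the nodes, dependency edges constrain execution order, and subtasks whose prerequisites are all satisfied can be dispatched to robots in parallel. The subtask names and the use of Python's standard-library `graphlib` below are illustrative assumptions, not the paper's actual implementation.

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical subtasks a QA LLM might produce for an instruction such as
# "clear the rubble, then inspect the area"; each entry maps a subtask to
# the set of subtasks it depends on (names are made up for illustration).
dag = {
    "locate_rubble": set(),            # no prerequisites
    "move_rubble":   {"locate_rubble"},  # needs detection first
    "inspect_area":  {"move_rubble"},    # needs the rubble cleared
}

ts = TopologicalSorter(dag)
ts.prepare()  # validates the graph and raises CycleError if it is cyclic

# Drain the sorter in "batches": every subtask within one batch has all
# prerequisites met and no mutual dependencies, so a coordinator could
# assign the whole batch to different robots concurrently.
batches = []
while ts.is_active():
    ready = sorted(ts.get_ready())  # subtasks executable right now
    batches.append(ready)
    for task in ready:
        ts.done(task)               # mark finished, unlocking successors

print(batches)
```

For this linear example each batch holds one subtask; with a wider DAG (e.g., two independent detection subtasks), a batch would contain several entries, which is exactly where explicit dependency modeling enables parallel multi-robot coordination.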