Seeing, Saying, Solving: An LLM-to-TL Framework for Cooperative Robots

📅 2025-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address task delays caused by sudden conflicts among heterogeneous robots in dynamic warehouse environments, this paper proposes a decentralized, collaborative conflict-resolution framework. The method introduces a novel end-to-end LLM-to-STL semantic grounding paradigm: a vision-language model (VLM) detects conflicts, and a large language model (LLM) generates natural-language help requests. Helper robots then translate these requests into verifiable temporal constraints, using Signal Temporal Logic (STL) semantics and a Backus-Naur Form (BNF) grammar to constrain the output, and quantify the impact on their own tasks via mixed-integer linear programming (MILP) so that the requester can choose among multiple candidates. Experimental results demonstrate a 37% reduction in total task delay compared to heuristic strategies (e.g., nearest-neighbor matching), alongside 100% syntactic correctness and executability in translating help requests into formal STL constraints.

📝 Abstract

Increased robot deployment, such as in warehousing, has revealed a need for seamless collaboration among heterogeneous robot teams to resolve unforeseen conflicts. To address this challenge, we propose a novel, decentralized framework for robots to request and provide help. The framework begins with a robot detecting a conflict using a Vision Language Model (VLM), then reasoning over whether help is needed. If so, the robot crafts and broadcasts a natural language (NL) help request using a Large Language Model (LLM). Potential helper robots reason over the request and offer help (if able), along with information about the impact on their current tasks. Helper reasoning is implemented via an LLM grounded in Signal Temporal Logic (STL) using a Backus-Naur Form (BNF) grammar to guarantee syntactically valid NL-to-STL translations, which are then solved as a Mixed Integer Linear Program (MILP). Finally, the requester robot chooses a helper by reasoning over impact on the overall system. We evaluate our system via experiments considering different strategies for choosing a helper, and find that a requester robot can minimize overall time impact on the system by considering multiple help offers versus simple heuristics (e.g., selecting the nearest robot to help).
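The final step of the abstract, where the requester compares several help offers instead of picking the nearest robot, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `HelpOffer` fields, the additive cost in `total_time_impact`, and the example numbers are hypothetical and do not come from the paper, which derives impact estimates from its MILP formulation.

```python
from dataclasses import dataclass


@dataclass
class HelpOffer:
    """One helper's reply to a help request (all fields are hypothetical)."""
    helper_id: str
    travel_time: float     # seconds for the helper to reach the requester
    help_duration: float   # seconds spent resolving the conflict
    own_task_delay: float  # seconds of delay added to the helper's own tasks


def total_time_impact(offer: HelpOffer) -> float:
    # Assumed cost model: the requester waits for travel plus help,
    # and the helper's own tasks slip by own_task_delay.
    return offer.travel_time + offer.help_duration + offer.own_task_delay


def choose_nearest(offers: list[HelpOffer]) -> HelpOffer:
    # Heuristic baseline: pick the helper that can arrive soonest.
    return min(offers, key=lambda o: o.travel_time)


def choose_min_impact(offers: list[HelpOffer]) -> HelpOffer:
    # Impact-aware choice: pick the helper with the lowest system-wide cost.
    return min(offers, key=total_time_impact)


offers = [
    HelpOffer("robot_a", travel_time=10.0, help_duration=20.0, own_task_delay=60.0),
    HelpOffer("robot_b", travel_time=25.0, help_duration=20.0, own_task_delay=5.0),
]

print(choose_nearest(offers).helper_id)     # robot_a
print(choose_min_impact(offers).helper_id)  # robot_b
```

In this made-up example the nearest helper is busy, so accepting its offer costs the system 90 seconds, while the farther helper costs 50 seconds. This is the kind of case in which weighing multiple offers can beat a nearest-robot heuristic.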

Problem

Research questions and friction points this paper is trying to address.

Enabling seamless collaboration among heterogeneous robot teams

Decentralized framework for robots to request and provide help

Natural language help requests and impact-aware helper selection

Innovation

Methods, ideas, or system contributions that make the work stand out.

VLM detects conflicts for robot teams

LLM crafts natural language help requests

STL-grounded LLM ensures syntactically valid NL-to-STL translations
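To make "syntactically valid" concrete, the following Python sketch checks a candidate STL string against a small BNF-style grammar with a recursive-descent parser. It is only an illustration under stated assumptions: the grammar, the operator spellings (`G`, `F`, `&`, `|`, `!`), and signal names such as `dist_to_pallet` are hypothetical, and the paper uses its grammar to ground the LLM's translation rather than to validate strings after the fact.

```python
import re

# Assumed toy grammar (not the paper's):
#   <formula>  ::= <term> (("&" | "|") <term>)*
#   <term>     ::= ("G" | "F") <interval> <term> | "!" <term>
#                | "(" <formula> ")" | <pred>
#   <interval> ::= "[" <num> "," <num> "]"
#   <pred>     ::= <ident> ("<" | ">" | "<=" | ">=") <num>
TOKEN_RE = re.compile(r"\s*(<=|>=|[A-Za-z_]\w*|\d+(?:\.\d+)?|[\[\](),&|!<>])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def is_number(tok):
    return re.fullmatch(r"\d+(?:\.\d+)?", tok) is not None


def is_identifier(tok):
    # "G" and "F" are reserved for the temporal operators.
    return re.fullmatch(r"[A-Za-z_]\w*", tok) is not None and tok not in ("G", "F")


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def expect(self, predicate):
        tok = self.peek()
        if tok is None or not predicate(tok):
            raise ValueError(f"unexpected token {tok!r}")
        self.i += 1
        return tok

    def formula(self):
        self.term()
        while self.peek() in ("&", "|"):
            self.i += 1
            self.term()

    def term(self):
        tok = self.peek()
        if tok in ("G", "F"):      # always / eventually over a time interval
            self.i += 1
            self.interval()
            self.term()
        elif tok == "!":
            self.i += 1
            self.term()
        elif tok == "(":
            self.i += 1
            self.formula()
            self.expect(lambda t: t == ")")
        else:
            self.predicate()

    def interval(self):
        self.expect(lambda t: t == "[")
        lo = float(self.expect(is_number))
        self.expect(lambda t: t == ",")
        hi = float(self.expect(is_number))
        self.expect(lambda t: t == "]")
        if lo > hi:
            raise ValueError("interval lower bound exceeds upper bound")

    def predicate(self):
        self.expect(is_identifier)
        self.expect(lambda t: t in ("<", ">", "<=", ">="))
        self.expect(is_number)


def is_valid_stl(text):
    """Return True if text parses completely under the toy grammar."""
    try:
        parser = Parser(tokenize(text))
        parser.formula()
        return parser.peek() is None
    except ValueError:
        return False


print(is_valid_stl("F[0,30] (dist_to_pallet < 0.5)"))  # True
print(is_valid_stl("eventually reach pallet"))         # False
```

A check like this covers syntax only. Whether a formula is satisfiable alongside the helper's existing tasks is a separate question, which the paper addresses by solving the resulting constraints as a MILP.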

🔎 Similar Papers

Wonderful Team: Zero-Shot Physical Task Planning with Visual LLMs