COHORT: Hybrid RL for Collaborative Large DNN Inference on Multi-Robot Systems Under Real-Time Constraints

📅 2026-03-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of deploying large vision-language models (VLMs) on multi-robot systems that operate without supporting infrastructure, where tight limits on bandwidth, latency, and energy make real-time inference difficult. The authors propose COHORT, a ROS-based framework for collaborative deep neural network (DNN) inference and task execution across multiple robots. COHORT combines offline reinforcement learning (Advantage-Weighted Regression, AWR) with online Multi-Agent Proximal Policy Optimization (MAPPO) to dynamically schedule distributed DNN modules. The offline policy is trained on data from an auction-based mechanism that allocates VLM tasks among heterogeneous robots, which the summary describes as a first. Experimental results show that, compared with baseline approaches, COHORT reduces battery consumption by 15.4%, improves GPU utilization by 51.67%, and meets frame-rate and deadline constraints 2.55× as often.
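The summary does not spell out how the auction works, so the following is only a minimal illustrative sketch of auction-based task allocation in general, not COHORT's actual mechanism. The `Robot` fields, the `bid` formula (projected GPU load divided by remaining battery), and the task costs are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Robot:
    name: str
    battery: float   # remaining battery as a fraction in (0, 1] (hypothetical field)
    gpu_load: float  # current GPU load as a fraction in [0, 1] (hypothetical field)


def bid(robot, task_cost):
    """Return a hypothetical bid (lower is better), or None if the robot lacks capacity."""
    projected_load = robot.gpu_load + task_cost
    if projected_load > 1.0:
        return None  # the task would overload this robot's GPU
    # Assumed scoring rule: busy robots and low-battery robots bid higher.
    return projected_load / max(robot.battery, 1e-6)


def run_auction(robots, tasks):
    """Assign each (task_name, cost) pair to the lowest bidder, one task at a time."""
    assignment = {}
    for task_name, cost in tasks:
        bids = []
        for robot in robots:
            offer = bid(robot, cost)
            if offer is not None:
                bids.append((offer, robot))
        if not bids:
            assignment[task_name] = None  # no robot can take this task
            continue
        _, winner = min(bids, key=lambda pair: pair[0])
        winner.gpu_load += cost  # the winner's load rises, which affects later bids
        assignment[task_name] = winner.name
    return assignment


robots = [Robot("r1", battery=0.9, gpu_load=0.1), Robot("r2", battery=0.8, gpu_load=0.1)]
tasks = [("clip", 0.3), ("sam", 0.5), ("extra", 0.9)]
assignment = run_auction(robots, tasks)
```

In this sketch each winning bid raises the winner's load, so later tasks tend to spread across robots instead of piling onto one.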

📝 Abstract

Large deep neural networks (DNNs), especially transformer-based and multimodal architectures, are computationally demanding and challenging to deploy on resource-constrained edge platforms such as field robots. These challenges intensify in mission-critical scenarios (e.g., disaster response), where robots must collaborate under tight constraints on bandwidth, latency, and battery life, often without infrastructure or server support. To address these limitations, we present COHORT, a collaborative DNN inference and task-execution framework for multi-robot systems built on the Robot Operating System (ROS). COHORT employs a hybrid offline-online reinforcement learning (RL) strategy to dynamically schedule and distribute DNN module execution across robots. Our key contributions are threefold: (a) offline RL policy learning with Advantage-Weighted Regression (AWR), trained on auction-based task allocation data from heterogeneous DNN workloads across distributed robots; (b) online policy adaptation via Multi-Agent PPO (MAPPO), initialized from the offline policy and fine-tuned in real time; and (c) a comprehensive evaluation of COHORT on vision-language model (VLM) inference tasks such as CLIP and SAM, analyzing scalability with increasing numbers of robots and workloads and robustness under . We benchmark COHORT against genetic algorithms and multiple RL baselines. Experimental results demonstrate that COHORT reduces battery consumption by 15.4% and increases GPU utilization by 51.67%, while satisfying frame-rate and deadline constraints 2.55× as often.
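For readers unfamiliar with AWR, the core idea is to fit the policy by weighted regression on logged actions, where each sample's weight is the exponential of its advantage divided by a temperature. The sketch below shows that weighting in plain Python. It illustrates the general AWR objective, not the authors' implementation: the function names, the temperature `beta`, the clipping value `max_weight`, and the toy numbers are assumptions made for illustration.

```python
import math


def awr_weights(returns, values, beta=1.0, max_weight=20.0):
    """Per-sample weights exp((return - value) / beta), capped at max_weight."""
    log_cap = math.log(max_weight)
    weights = []
    for ret, val in zip(returns, values):
        advantage = ret - val
        # Cap the exponent rather than the result to avoid overflow in math.exp.
        weights.append(math.exp(min(advantage / beta, log_cap)))
    return weights


def awr_policy_loss(log_probs, weights):
    """Advantage-weighted negative log-likelihood, averaged over the batch."""
    total = sum(w * lp for w, lp in zip(weights, log_probs))
    return -total / len(log_probs)


# Toy offline batch (made-up numbers): observed returns, value estimates, and
# the policy's log-probabilities of the logged actions.
returns = [1.0, 2.0, 0.5]
values = [1.0, 1.0, 1.0]
log_probs = [-1.0, -0.5, -2.0]

weights = awr_weights(returns, values, beta=1.0)
loss = awr_policy_loss(log_probs, weights)
```

Actions with above-average returns receive weights greater than 1 and pull the policy toward them. A smaller `beta` makes the weighting more selective.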

Problem

Research questions and friction points this paper is trying to address.

multi-robot systems

large DNN inference

real-time constraints

resource-constrained edge

collaborative inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid Reinforcement Learning

Collaborative DNN Inference

Multi-Robot Systems