RoboECC: Multi-Factor-Aware Edge-Cloud Collaborative Deployment for VLA Models

📅 2026-03-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the unstable inference performance of Vision-Language-Action (VLA) models in edge deployment, which stems from architectural heterogeneity across models and from network bandwidth fluctuations. The authors propose RoboECC, a multi-factor-aware edge-cloud collaborative (ECC) deployment framework for VLA models. RoboECC uses a model-hardware co-aware segmentation strategy to find the optimal split point for a given model, and a network-aware deployment adjustment mechanism to keep performance near optimal as bandwidth changes. Experiments report a speedup of up to 3.28× across various VLA models with 2.55× to 2.62× overhead.

📝 Abstract
Vision-Language-Action (VLA) models are mainstream in embodied intelligence but face high inference costs. Edge-Cloud Collaborative (ECC) deployment offers an effective fix by easing the computing pressure on edge devices to meet real-time needs. However, existing ECC frameworks are suboptimal for VLA models due to two challenges: (1) diverse model structures make it hard to identify the optimal ECC segmentation point; (2) even once the optimal segmentation point is determined, changes in network bandwidth can cause performance drift. To address these issues, we propose a novel ECC deployment framework for various VLA models, termed RoboECC. Specifically, we propose a model-hardware co-aware segmentation strategy to find the optimal segmentation point for various VLA models. Moreover, we propose a network-aware deployment adjustment approach that adapts to network fluctuations and maintains optimal performance. Experiments demonstrate that RoboECC achieves a speedup of up to 3.28x with only 2.55x to 2.62x overhead.
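The abstract does not give the details of the segmentation strategy, so the following is only a minimal illustrative sketch of the general idea of picking an edge-cloud split point from profiled latencies and then re-picking it when bandwidth changes. It is not the authors' method. The names `Profile`, `end_to_end_ms`, and `best_split`, the per-layer latency model, and all numbers are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Profile:
    """Hypothetical per-layer profile of one model on one edge/cloud pair."""
    edge_ms: Sequence[float]        # latency of each layer on the edge device
    cloud_ms: Sequence[float]       # latency of each layer on the cloud server
    boundary_bytes: Sequence[int]   # bytes sent if the first k layers run on the edge (k = 0..n-1)


def end_to_end_ms(p: Profile, split: int, bandwidth_mbps: float, rtt_ms: float = 0.0) -> float:
    """Estimated latency when layers [0, split) run on the edge and the rest in the cloud."""
    n = len(p.edge_ms)
    edge = sum(p.edge_ms[:split])
    if split == n:
        return edge  # everything stays on the edge, so nothing is transferred
    cloud = sum(p.cloud_ms[split:])
    # bytes -> bits, divided by bits per second, converted to milliseconds
    transfer = p.boundary_bytes[split] * 8 / (bandwidth_mbps * 1e6) * 1e3 + rtt_ms
    return edge + transfer + cloud


def best_split(p: Profile, bandwidth_mbps: float, rtt_ms: float = 0.0) -> int:
    """Exhaustively pick the split index with the lowest estimated latency."""
    n = len(p.edge_ms)
    return min(range(n + 1), key=lambda k: end_to_end_ms(p, k, bandwidth_mbps, rtt_ms))


# Made-up three-layer profile: slow edge, fast cloud, shrinking activations.
profile = Profile(
    edge_ms=[10.0, 10.0, 10.0],
    cloud_ms=[1.0, 1.0, 1.0],
    boundary_bytes=[1_000_000, 100_000, 10_000],
)

# Re-evaluating the split as measured bandwidth changes mimics a network-aware adjustment.
for mbps in (1.0, 100.0, 1000.0):
    print(mbps, best_split(profile, mbps))
```

In this toy profile the chosen split moves toward the cloud as bandwidth rises and toward the edge as it falls, which is the kind of drift that a fixed split point cannot follow.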
Problem

Research questions and friction points this paper is trying to address.

Vision-Language-Action models
Edge-Cloud Collaboration
model segmentation
network bandwidth fluctuation
embodied intelligence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Edge-Cloud Collaboration
Vision-Language-Action Models
Model-Hardware Co-aware Segmentation
Network-aware Adaptation
Embodied Intelligence
Authors

Zihao Zheng
Peking University
Machine Learning System, Edge Computing, Computer Architecture, EDA

Hangyu Cao
School of Computer Science, South China University of Technology, Guangzhou, China

Jiayu Chen
PhD student, IFLab@PKU
Efficient Visual Generation, ML system

Sicheng Tian
School of Artificial Intelligence, Beijing Normal University, Beijing, China

Chenyue Li
Hong Kong University of Science and Technology
AI for Science, Large Language Model

Maoliang Li
School of Computer Science, Peking University, Beijing, China

Xinhao Sun
School of EECS, Peking University, Beijing, China

Guojie Luo
Peking University
Electronic Design Automation, Reconfigurable Architecture

Xiang Chen
School of Computer Science, Peking University, Beijing, China