🤖 AI Summary
Existing edge-cloud collaborative inference methods for Vision-Language-Action (VLA) models are highly susceptible to visual noise and overlook step-level redundancy inherent in embodied tasks, thereby compromising the physical continuity of generated actions. To address these limitations, this work proposes RAPID, a novel framework that, for the first time, explicitly models step-level redundancy in embodied tasks and introduces a noise-resilient, continuity-preserving edge-cloud partitioning mechanism. By jointly leveraging environmental and task-specific features, RAPID dynamically optimizes redundancy-aware inference partitioning strategies. This approach achieves up to 1.73× inference acceleration with only 5%–7% additional overhead, significantly enhancing both computational efficiency and action coherence in embodied reasoning.
📝 Abstract
Vision-Language-Action (VLA) models are the mainstream approach in embodied intelligence but suffer from high inference costs. Edge-Cloud Collaborative (ECC) inference offers an effective remedy by relieving the computing pressure on edge devices to meet real-time requirements. However, existing ECC frameworks are suboptimal for VLA models due to two challenges: (1) mainstream environment-oriented edge-cloud partitioning methods are susceptible to interference from visual noise; (2) existing partitioning methods overlook the step-level redundancy unique to embodied tasks, thereby disrupting the physical continuity of motion. To address these issues, we propose a novel ECC inference framework, termed RAPID, together with a concrete implementation tailored to it. Experiments demonstrate that RAPID achieves a speedup of up to 1.73× with only 5%–7% additional overhead.
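Neither the summary nor the abstract spells out how the per-step partition decision is made, but the described mechanism (a redundancy test over consecutive steps, made robust to visual noise) can be sketched. The snippet below is a minimal illustrative sketch, not RAPID's actual algorithm: `redundancy_score`, `choose_partition`, and the threshold `tau` are hypothetical names, and the cosine-similarity test with a noise-relaxed threshold is only one plausible reading of "redundancy-aware, noise-resilient" partitioning.

```python
import numpy as np

def redundancy_score(prev_feat: np.ndarray, cur_feat: np.ndarray) -> float:
    """Cosine similarity between consecutive step features.

    High similarity suggests the current step is largely redundant
    with respect to the previous one. (Hypothetical metric, not
    necessarily the one RAPID uses.)
    """
    num = float(np.dot(prev_feat, cur_feat))
    den = float(np.linalg.norm(prev_feat) * np.linalg.norm(cur_feat)) + 1e-8
    return num / den

def choose_partition(prev_feat: np.ndarray,
                     cur_feat: np.ndarray,
                     noise_level: float,
                     tau: float = 0.95) -> str:
    """Toy step-level partition policy (assumption, for illustration):

    - redundant step -> reuse the cached cloud result and run only the
      lightweight edge head, preserving action continuity;
    - under visual noise, relax the reuse threshold so a single
      corrupted frame does not force a costly cloud round-trip;
    - otherwise, offload the heavy backbone to the cloud.
    """
    score = redundancy_score(prev_feat, cur_feat)
    threshold = tau - 0.05 * noise_level  # relax under visual noise
    return "edge_only_reuse_cache" if score >= threshold else "offload_to_cloud"

# Example: a near-identical consecutive frame triggers cache reuse.
prev = np.random.default_rng(0).standard_normal(512)
cur = prev + 0.01 * np.random.default_rng(1).standard_normal(512)
print(choose_partition(prev, cur, noise_level=0.2))  # -> "edge_only_reuse_cache"
```

Under this reading, the reported speedup would come from skipping cloud round-trips on redundant steps, while the small 5%–7% overhead corresponds to computing the redundancy test itself.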