AI Summary
This work targets an inefficiency in existing edge-cloud collaborative video analytics systems: caching intermediate features at whole-scene granularity causes many reusable features to be marked invalid under non-uniform motion, wasting computation and bandwidth. The authors propose FluxShard, a motion-aware feature caching and reuse mechanism that uses codec-level block motion vectors (MVs) to manage the cache at the granularity of individual motion regions. A Receptive Field Alignment Principle (RFAP) ensures that reused features remain correct, and an MV-guided cache remapping strategy keeps the cache consistent across frames. Across multiple vision tasks and dynamic video scenarios, the method reduces latency by 32.6-83.8% and energy consumption by 14.9-64.0% relative to baseline approaches while staying within the prescribed accuracy budget.
Abstract
Caching and reusing intermediate features across consecutive frames is a common technique to reduce redundant computation and transmission for edge-cloud video analytics in mobile edge computing. Existing methods manage the cache in a fixed or globally shifted coordinate system, treating it as an indivisible whole. Under the non-uniform motion patterns of mobile scenes, this whole-scene granularity invalidates large portions of the cache even when most content has merely shifted spatially, wasting computation and bandwidth. The root cause is a granularity mismatch: the cache is managed per scene, yet motion varies per region. In this paper, we present FluxShard, a motion-aware edge-cloud video analytics system that uses codec-level block motion vectors (MVs) to manage feature cache reuse and recomputation at the granularity of individual motion regions. By re-indexing cached features along per-block MVs, FluxShard separates spatial displacement from content changes, recovering reusable content that whole-scene methods would otherwise discard. To ensure correct reuse under heterogeneous motion, the Receptive Field Alignment Principle (RFAP) identifies, from the input-level MV field alone, the positions that must be recomputed due to inconsistent spatial composition within receptive fields. To maintain cache coherence across frames, MV-guided cache remapping warps the entire feature cache to the current coordinate system each frame, sustaining a high reuse ratio over time. A profiling-driven dispatcher routes the remaining sparse workload between edge and cloud for lower latency. Evaluation across multiple vision tasks, dynamic video benchmarks, and network conditions shows that FluxShard reduces latency by 32.6-83.8% and energy by 14.9-64.0% over all baselines under the prescribed accuracy budget.
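The two ideas of re-indexing cached features along per-block MVs and flagging positions whose receptive field mixes different motions can be illustrated with a small NumPy sketch. This is not the FluxShard implementation. It assumes, for simplicity, that each feature cell corresponds to exactly one codec block, that MVs are integer offsets in feature cells pointing from a current cell to its source in the previous frame, and that the receptive field is a square window of a given radius. The function names `remap_cache` and `consistent_mask` are hypothetical.

```python
import numpy as np

def remap_cache(cache, mv_cells):
    """Warp a cached feature map into the current frame's coordinates.

    cache:    (C, H, W) features computed on the previous frame.
    mv_cells: (H, W, 2) integer offsets (dy, dx); the current cell (y, x)
              is assumed to have come from (y + dy, x + dx) in the cache.
    Returns the warped cache and a boolean (H, W) mask of cells whose
    source position lies inside the cache.
    """
    _, H, W = cache.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    src_y = ys + mv_cells[..., 0]
    src_x = xs + mv_cells[..., 1]
    valid = (src_y >= 0) & (src_y < H) & (src_x >= 0) & (src_x < W)
    # Clip so the gather is always in range; out-of-range cells are zeroed below.
    warped = cache[:, np.clip(src_y, 0, H - 1), np.clip(src_x, 0, W - 1)]
    warped[:, ~valid] = 0
    return warped, valid

def consistent_mask(mv_cells, radius=1):
    """Mark cells whose whole (2*radius+1)^2 window shares one MV.

    A simplified stand-in for a receptive-field consistency check: a cell
    is treated as reusable only if every neighbour in its window moved by
    the same offset, so the window's spatial composition is unchanged.
    """
    H, W, _ = mv_cells.shape
    pad = ((radius, radius), (radius, radius), (0, 0))
    padded = np.pad(mv_cells, pad, mode="edge")
    ok = np.ones((H, W), dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            neighbour = padded[dy:dy + H, dx:dx + W]
            ok &= np.all(neighbour == mv_cells, axis=-1)
    return ok

def reusable_positions(cache, mv_cells, radius=1):
    """Combine both checks; cells outside the mask would be recomputed."""
    warped, valid = remap_cache(cache, mv_cells)
    return warped, valid & consistent_mask(mv_cells, radius)
```

Calling `reusable_positions(cache, mv_cells)` on a frame where only one region moves returns a mask that is false only along the boundary between the two motion regions and at cells whose source fell outside the cache, which is the sparse set a whole-scene scheme would instead extend to the full frame.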