🤖 AI Summary
To address load imbalance, high communication overhead, and poor adaptability to hardware heterogeneity in collaborative DNN inference on edge devices, this paper proposes a flexible combinatorial optimization framework that enables dynamic model partitioning, cross-device reallocation, and topology-aware joint computation-communication scheduling, achieving efficient heterogeneous collaboration without accuracy loss. The method integrates integer linear programming (ILP), lightweight heuristic search, and runtime adaptive scheduling, and is compatible with mainstream edge hardware and inference backends, including TensorRT and TVM. Evaluated on a real-world edge cluster, the approach achieves a 2.3× average inference speedup, a 47% reduction in communication volume, and 61% lower end-to-end latency compared with PipeDream and SplitNN. These results demonstrate significant improvements in both inference efficiency and deployment flexibility for heterogeneous edge inference systems.
📝 Abstract
The rapid advancement of deep learning has catalyzed the development of novel IoT applications, which often deploy pre-trained deep neural network (DNN) models across multiple edge devices for collaborative inference.