Problem
Research questions and friction points this paper is trying to address.
Distributed Transformer Inference
Embedded Edge Devices
Communication Overhead
CPU-GPU Staging
Hardware Constraints
Methods, ideas, or system contributions that make the work stand out.