Synergy: Towards On-Body AI via Tiny AI Accelerator Collaboration on Wearables

πŸ“… 2023-12-11
πŸ“ˆ Citations: 2
✨ Influential: 0
πŸ€– AI Summary
To address concurrency bottlenecks when multiple AI applications share tiny AI accelerators on wearable devices, this paper proposes a system-level co-optimization framework. It tackles the low throughput, high latency, and excessive power consumption that arise when multiple AI applications run concurrently on resource-constrained edge devices. The framework introduces a device-agnostic unified programming interface and a resource-aware dynamic execution scheduling mechanism, enabling co-modeling across heterogeneous accelerators, parallelization across multiple compute units, and real-time scheduling decisions. Its core innovation is the tight integration of accelerator-agnostic abstraction with dynamic execution-policy generation. Experimental evaluation against seven state-of-the-art baselines demonstrates an average 23.0× improvement in inference throughput, a 73.9% reduction in end-to-end latency, and a 15.8% decrease in power consumption.

πŸ“ Abstract
The advent of tiny artificial intelligence (AI) accelerators enables AI to run at the extreme edge, offering reduced latency, lower power cost, and improved privacy. When integrated into wearable devices, these accelerators open exciting opportunities, allowing various AI apps to run directly on the body. We present Synergy, which provides AI apps with best-effort performance via system-driven holistic collaboration over AI accelerator-equipped wearables. To achieve this, Synergy provides device-agnostic programming interfaces to AI apps, giving the system visibility into and controllability over each app's resource use. Synergy then maximizes the inference throughput of concurrent AI models by creating various execution plans for each app based on AI accelerator availability and intelligently selecting the best set of execution plans. Synergy further improves throughput by leveraging parallelization opportunities over multiple computation units. Our evaluations with 7 baselines and 8 models demonstrate that, on average, Synergy achieves a 23.0 times improvement in throughput, while reducing latency by 73.9% and power consumption by 15.8%, compared to the baselines.
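The execution-plan selection the abstract describes can be illustrated with a small sketch: each app offers several candidate plans (each mapping the model onto accelerators with an estimated throughput), and the system picks one plan per app so that no accelerator is oversubscribed while aggregate throughput is maximized. The function and data-structure names below, and the brute-force search itself, are illustrative assumptions, not the paper's actual algorithm.

```python
from itertools import product

def select_plans(apps, accelerators):
    """Pick one execution plan per app, maximizing total throughput.

    apps: {app_name: [(plan_name, {accelerator: load_fraction}, throughput), ...]}
    accelerators: {accelerator: capacity}
    Returns (assignment dict, total throughput), or (None, -1.0) if infeasible.
    """
    names = list(apps)
    best, best_tp = None, -1.0
    # Exhaustively try every combination of one plan per app (fine for a
    # handful of wearable apps; a real scheduler would prune this search).
    for combo in product(*(apps[n] for n in names)):
        load = {a: 0.0 for a in accelerators}
        feasible = True
        for _, demand, _ in combo:
            for acc, d in demand.items():
                load[acc] += d
                if load[acc] > accelerators[acc]:
                    feasible = False
        if not feasible:
            continue
        tp = sum(plan[2] for plan in combo)
        if tp > best_tp:
            best_tp = tp
            best = {n: plan[0] for n, plan in zip(names, combo)}
    return best, best_tp

# Hypothetical example: two apps, each with an NPU plan and a CPU fallback.
apps = {
    "gesture": [("npu-only", {"npu": 0.6}, 30.0), ("cpu-only", {"cpu": 0.8}, 8.0)],
    "keyword": [("npu-only", {"npu": 0.5}, 25.0), ("cpu-only", {"cpu": 0.5}, 10.0)],
}
accs = {"npu": 1.0, "cpu": 1.0}
plan, tp = select_plans(apps, accs)
# Both apps on the NPU would oversubscribe it (0.6 + 0.5 > 1.0), so the
# best feasible set runs gesture on the NPU and keyword on the CPU.
```

Note how the highest-throughput plan for each app in isolation (both on the NPU) is infeasible jointly; selecting plans as a set, as Synergy does, is what recovers the throughput.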
Problem

Research questions and friction points this paper is trying to address.

Enabling AI on wearable devices via tiny accelerators
Optimizing AI model throughput via execution plans
Reducing latency and power in on-body AI systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Device-agnostic programming interfaces for AI apps
Intelligent execution plan selection for AI models
Parallelization over multiple computation units
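The device-agnostic programming interface listed above can be sketched as a thin runtime layer: apps submit inference requests through a single API, and the system (not the app) decides which accelerator backend serves each request, which is what gives it visibility and control over resource use. All class and method names here are illustrative assumptions; the paper's actual interface may differ.

```python
from abc import ABC, abstractmethod

class Backend(ABC):
    """One concrete accelerator (e.g. CPU, NPU) behind a uniform API."""
    @abstractmethod
    def run(self, model: str, inputs: list) -> list: ...

class NPUBackend(Backend):
    def run(self, model, inputs):
        # Stand-in for dispatching to a tiny AI accelerator.
        return [f"{model}:npu:{x}" for x in inputs]

class CPUBackend(Backend):
    def run(self, model, inputs):
        # Stand-in for a CPU fallback path.
        return [f"{model}:cpu:{x}" for x in inputs]

class Runtime:
    """System-owned dispatcher: apps never name a device themselves."""
    def __init__(self, backends):
        self.backends = backends  # {name: Backend}, in preference order
        self.busy = set()

    def infer(self, model, inputs):
        # Route to the first idle backend; fall back to the first backend.
        for name, backend in self.backends.items():
            if name not in self.busy:
                return backend.run(model, inputs)
        return next(iter(self.backends.values())).run(model, inputs)

# Hypothetical usage: the app only calls infer(); device choice is the
# runtime's decision, so the system can reschedule apps transparently.
runtime = Runtime({"npu": NPUBackend(), "cpu": CPUBackend()})
out = runtime.infer("keyword-spotting", [1, 2])
```

Because device selection lives in the runtime rather than in each app, the scheduler is free to remap apps across accelerators as availability changes, which is the controllability the Innovation list refers to.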
Taesik Gong
Assistant Professor, UNIST
On-Device AI, Human-Centered AI, Machine Learning, Ubiquitous Computing, Personalization
S. Jang
Nokia Bell Labs, Cambridge, UK
Utku GΓΌnay Acer
Nokia Bell Labs, Antwerp, Belgium
F. Kawsar
Nokia Bell Labs, Cambridge, UK and University of Glasgow, Glasgow, UK
Chulhong Min
Nokia Bell Labs, Cambridge, UK