AutoTailor: Automatic and Efficient Adaptive Model Deployment for Diverse Edge Devices

📅 2025-11-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Adaptive model deployment on heterogeneous edge devices remains challenging because existing SuperNet approaches require tedious model-aware development and time-consuming hardware-aware profiling. Method: The paper presents AutoTailor, described as the first end-to-end automated SuperNet deployment framework. It uses computation-graph-guided compilation to automatically transform user-provided models into SuperNets, and it integrates learning-free latency and accuracy predictors for low-cost performance prediction during model specialization. Contributions/Results: Compared to state-of-the-art approaches, AutoTailor reduces the lines of code for SuperNet construction by 11–27×, cuts hardware-aware profiling cost by at least 11×, and achieves up to 15.60% absolute accuracy improvement and 60.03% latency reduction.

📝 Abstract

On-device machine learning (ML) has become a fundamental component of emerging mobile applications. Adaptive model deployment delivers efficient inference for heterogeneous device capabilities and performance requirements through customizing neural architectures. SuperNet-based approaches offer a promising solution by generating a large number of model variants from a pre-trained ML model. However, applying SuperNet in existing frameworks suffers from tedious model-aware development and time-consuming hardware-aware profiling, which limits their practical adoption. We present AutoTailor, the first framework to enable automated, end-to-end SuperNet-based adaptive model deployment for edge devices. Unlike manual SuperNet construction, AutoTailor employs a computation graph-guided compilation approach to automatically transform user-provided ML models into SuperNets. To support efficient specialization, AutoTailor incorporates learning-free latency and accuracy predictors, enabling low-cost yet accurate performance prediction. Our extended evaluations demonstrate that AutoTailor reduces the lines of code for SuperNet construction by 11--27$\times$, decreases hardware-aware profiling costs by at least 11$\times$, and achieves up to 15.60% absolute accuracy improvement and 60.03% latency reduction compared to state-of-the-art approaches across diverse models and devices.

Problem

Research questions and friction points this paper is trying to address.

Automates SuperNet creation for edge device model deployment

Reduces manual coding and hardware profiling costs significantly

Improves inference accuracy and latency across diverse devices

Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated SuperNet generation via computation graph-guided compilation

Learning-free latency and accuracy predictors for efficient specialization

End-to-end adaptive model deployment reducing code and profiling costs
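
To make the idea of specialization with learning-free predictors concrete, here is a minimal Python sketch. It is not AutoTailor's actual method: the search space (`DEPTHS`, `WIDTHS`), the per-layer latency table, the additive latency estimate, and the `proxy_accuracy` score are all hypothetical stand-ins chosen for illustration. It only shows the general pattern of enumerating SuperNet variants and picking the best-scoring one that fits a device's latency budget without training a predictor.

```python
from itertools import product

# Hypothetical search space: each variant picks a depth and a width multiplier.
DEPTHS = (2, 3, 4)
WIDTHS = (0.5, 0.75, 1.0)

# Hypothetical per-layer latency (ms) on one device, assumed to be measured once.
LAYER_LATENCY_MS = {0.5: 1.0, 0.75: 1.8, 1.0: 3.0}


def predict_latency_ms(depth, width):
    """Additive estimate with no trained model: per-layer cost times layer count."""
    return depth * LAYER_LATENCY_MS[width]


def proxy_accuracy(depth, width):
    """Placeholder capacity score standing in for an accuracy predictor."""
    return depth * width


def specialize(budget_ms):
    """Return the (depth, width) variant with the best proxy score within budget."""
    feasible = [
        (d, w)
        for d, w in product(DEPTHS, WIDTHS)
        if predict_latency_ms(d, w) <= budget_ms
    ]
    if not feasible:
        return None  # no variant fits this device's budget
    return max(feasible, key=lambda v: proxy_accuracy(*v))


print(specialize(6.0))  # (3, 0.75)
```

With a 6 ms budget, six of the nine variants are feasible, and the sketch selects depth 3 at width 0.75. A tighter budget that no variant satisfies returns `None`.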
