Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems

📅 2026-02-19
🤖 AI Summary
This work addresses the multi-objective optimization challenge of deploying AI models on ARM Cortex-M embedded processors, balancing energy efficiency, accuracy, and resource utilization. The authors propose an automated, Pareto-optimal multi-objective benchmarking framework to systematically evaluate key performance metrics across Cortex-M0+, M4, and M7 cores. Their analysis reveals an approximately linear relationship between FLOPs and inference latency and quantifies the trade-off between energy consumption and model accuracy through Pareto front characterization. The study demonstrates that the M7 excels in short inference tasks, the M4 achieves superior energy efficiency for longer workloads, and the M0+ is best suited for lightweight applications, thereby offering clear guidance for processor selection and sustainable design in embedded AI systems.
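The approximately linear FLOPs-to-latency relationship described above suggests that latency on a given core can be estimated from a model's FLOP count with a simple line fit. The following is a minimal sketch of that idea, not the authors' test bench: the function names `fit_line` and `predict_latency_ms` are hypothetical, and the measurement values are made-up placeholders chosen only to illustrate the arithmetic.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_latency_ms(flops, slope, intercept):
    """Estimate inference latency (ms) for a model with the given FLOP count."""
    return slope * flops + intercept


# Placeholder data (NOT from the paper): FLOPs per inference and the
# latency in milliseconds measured on one hypothetical core.
flops = [1e6, 2e6, 4e6, 8e6]
latency_ms = [12.0, 22.0, 42.0, 82.0]

slope, intercept = fit_line(flops, latency_ms)
estimate = predict_latency_ms(3e6, slope, intercept)
```

In practice one fit would be made per processor, since the slope (time per FLOP) depends on clock rate and on whether the core has a hardware floating-point unit.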

📝 Abstract
This work presents a practical benchmarking framework for optimizing artificial intelligence (AI) models on ARM Cortex processors (M0+, M4, M7), focusing on energy efficiency, accuracy, and resource utilization in embedded systems. Through the design of an automated test bench, we provide a systematic approach to evaluating key performance indicators (KPIs) and identifying optimal combinations of processor and AI model. The research highlights a near-linear correlation between floating-point operations (FLOPs) and inference time, offering a reliable metric for estimating computational demands. Using Pareto analysis, we demonstrate how to balance trade-offs between energy consumption and model accuracy, ensuring that AI applications meet performance requirements without compromising sustainability. Key findings indicate that the M7 processor is ideal for short inference cycles, while the M4 processor offers better energy efficiency for longer inference tasks. The M0+ processor, while less efficient for complex AI models, remains suitable for simpler tasks. This work provides insights for developers, guiding them to design energy-efficient AI systems that deliver high performance in real-world applications.
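The Pareto analysis mentioned in the abstract can be illustrated with a small sketch that keeps only the processor and model combinations not dominated on energy (lower is better) and accuracy (higher is better). This is an assumption-laden illustration, not the paper's implementation: the `Candidate` type, the function names, and every number below are hypothetical placeholders.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str          # hypothetical "model@processor" label
    energy_mj: float   # energy per inference in millijoules (lower is better)
    accuracy: float    # model accuracy in [0, 1] (higher is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.energy_mj <= b.energy_mj and a.accuracy >= b.accuracy
    strictly_better = a.energy_mj < b.energy_mj or a.accuracy > b.accuracy
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Placeholder values (NOT results from the paper).
candidates = [
    Candidate("model_a@M4", 1.0, 0.80),
    Candidate("model_b@M4", 2.0, 0.90),
    Candidate("model_a@M7", 1.5, 0.80),   # dominated by model_a@M4
    Candidate("model_c@M7", 3.0, 0.85),   # dominated by model_b@M4
    Candidate("model_d@M0+", 0.5, 0.60),
]

front = pareto_front(candidates)
front_names = [c.name for c in front]
```

Every point on the resulting front is a defensible choice; which one to deploy depends on the application's minimum accuracy requirement or its energy budget.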
Problem

Research questions and friction points this paper is trying to address.

Pareto optimality
AI benchmarking
ARM Cortex processors
energy efficiency
embedded systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pareto optimization
ARM Cortex benchmarking
energy-efficient AI
embedded AI systems
FLOPs-inference correlation
Pranay Jain
Apple
Machine Design, Product Design, Sensors, Instrumentation, Optics
Maximilian Kasper
Fraunhofer Institute for Integrated Circuits IIS, Germany
Göran Köber
Intelligent Embedded Systems (IES) - Lab, University of Freiburg, Germany
Axel Plinge
Fraunhofer IIS
Machine Learning, Artificial Intelligence, Computer Science, Philosophy
Dominik Seuß
Fraunhofer Institute for Integrated Circuits IIS, Germany; Center for Artificial Intelligence and Robotics (CAIRO), Technische Hochschule Würzburg-Schweinfurt, Germany