PERTINENCE: Input-based Opportunistic Neural Network Dynamic Execution

📅 2025-07-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Deep neural networks (DNNs) suffer from inefficient computational resource allocation across inputs of varying complexity, leading to suboptimal energy efficiency.
Method: This paper proposes an input-aware dynamic model selection mechanism that integrates pretrained CNNs, Vision Transformers, and a lightweight complexity estimation module into a multi-model collaborative inference system; it further introduces a genetic algorithm to optimize input routing policies online, achieving Pareto-optimal trade-offs between accuracy and computational cost.
Contribution/Results: Experiments on CIFAR-10, CIFAR-100, and TinyImageNet demonstrate that the method maintains or even improves classification accuracy while reducing floating-point operations by up to 36%, significantly enhancing inference efficiency and adaptability to input heterogeneity.

📝 Abstract
Deep neural networks (DNNs) have become ubiquitous thanks to their remarkable ability to model complex patterns across various domains such as computer vision, speech recognition, and robotics. While large DNN models are often more accurate than simpler, lightweight models, they are also resource- and energy-hungry. Hence, it is imperative to design methods to reduce reliance on such large models without significant degradation in output accuracy. The high computational cost of these models is often necessary only for a reduced set of challenging inputs, while lighter models can handle most simple ones. Thus, carefully combining properties of existing DNN models in a dynamic, input-based way opens opportunities to improve efficiency without impacting accuracy. In this work, we introduce PERTINENCE, a novel online method designed to analyze the complexity of input features and dynamically select the most suitable model from a pre-trained set to process a given input effectively. To achieve this, we employ a genetic algorithm to explore the training space of an ML-based input dispatcher, enabling convergence towards the Pareto front in the solution space that balances overall accuracy and computational efficiency. We showcase our approach on state-of-the-art Convolutional Neural Networks (CNNs) trained on the CIFAR-10 and CIFAR-100 datasets, as well as Vision Transformers (ViTs) trained on the TinyImageNet dataset. We report results showing PERTINENCE's ability to provide alternative solutions to existing state-of-the-art models in terms of trade-offs between accuracy and number of operations. By opportunistically selecting among models trained for the same task, PERTINENCE achieves better or comparable accuracy with up to 36% fewer operations.
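The core idea of the abstract, routing each input to the cheapest adequate model, can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the `complexity_score` heuristic and the `light_model`/`heavy_model` names are hypothetical stand-ins for the paper's ML-based dispatcher and pre-trained CNN/ViT set.

```python
# Toy sketch of input-based dynamic model selection (hypothetical; the
# paper trains an ML-based dispatcher rather than using a fixed heuristic).

def complexity_score(x):
    """Stand-in complexity estimate: feature variance of the input."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)

def light_model(x):
    # Placeholder for a cheap pre-trained model (few operations).
    return "light prediction"

def heavy_model(x):
    # Placeholder for an expensive, more accurate model.
    return "heavy prediction"

def dispatch(x, threshold=1.0):
    """Route simple inputs to the light model, hard ones to the heavy one."""
    if complexity_score(x) < threshold:
        return light_model(x), "light"
    return heavy_model(x), "heavy"
```

In the paper, the routing policy itself is learned (and tuned by a genetic algorithm), so the threshold and score here merely stand in for that dispatcher's decision boundary.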
Problem

Research questions and friction points this paper is trying to address.

Reducing reliance on large DNN models without accuracy loss
Dynamically selecting optimal models based on input complexity
Balancing computational efficiency and accuracy in neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic model selection based on input complexity
Genetic algorithm optimizes accuracy-efficiency trade-off
Reduces operations by up to 36% with better or comparable accuracy
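The accuracy-vs-operations trade-off the genetic algorithm converges to is a Pareto front: a dispatch policy survives only if no other policy is at least as accurate and at least as cheap. A minimal, illustrative non-dominated filter (not the paper's GA, whose selection operates during dispatcher training):

```python
# Minimal sketch of extracting the Pareto front over (accuracy, ops)
# pairs for candidate dispatch policies; illustrative only.

def pareto_front(candidates):
    """Return candidates not dominated by any other.

    candidates: list of (accuracy, ops) tuples, where higher accuracy
    and fewer operations are both better.
    """
    front = []
    for acc, ops in candidates:
        dominated = any(
            # Another candidate is at least as good on both objectives
            # and strictly better on at least one.
            (a >= acc and o <= ops) and (a > acc or o < ops)
            for a, o in candidates
        )
        if not dominated:
            front.append((acc, ops))
    return front
```

For example, with candidates `[(0.90, 100), (0.85, 50), (0.80, 60)]`, the policy `(0.80, 60)` is dominated by `(0.85, 50)` and is filtered out, leaving the two genuine trade-off points.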