DOSA: Differentiable Model-Based One-Loop Search for DNN Accelerators

📅 2025-09-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Jointly optimizing the hardware design space and the algorithm-to-hardware mapping space leads to a combinatorial explosion. This paper proposes DOSA, a framework that formulates hardware and mapping co-search as a differentiable optimization problem. DOSA builds differentiable performance models, which lets gradient descent explore both spaces simultaneously. The framework is modular: its analytical model can be augmented with a learned model. In experiments with a similar number of samples, DOSA outperforms random search and Bayesian optimization by 2.80× and 12.59×, respectively, in improving DNN model energy-delay product (EDP). With the learned-model augmentation, optimizing the buffer sizes and mappings of a real DNN accelerator yields a 1.82× improvement in EDP. The core contribution is a single differentiable search over hardware parameters and mappings that replaces separate exploration of the two spaces.

📝 Abstract

In the hardware design space exploration process, it is critical to optimize both hardware parameters and algorithm-to-hardware mappings. Previous work has largely approached this simultaneous optimization problem by exploring the hardware design space and the mapspace, both individually large and highly nonconvex spaces, independently. The resulting combinatorial explosion has created significant difficulties for optimizers. In this paper, we introduce DOSA, which consists of differentiable performance models and a gradient descent-based optimization technique to simultaneously explore both spaces and identify high-performing design points. Experimental results demonstrate that DOSA outperforms random search and Bayesian optimization by 2.80x and 12.59x, respectively, in improving DNN model energy-delay product, given a similar number of samples. We also demonstrate the modularity and flexibility of DOSA by augmenting our analytical model with a learned model, allowing us to optimize buffer sizes and mappings of a real DNN accelerator and attain a 1.82x improvement in energy-delay product.
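To make the idea of gradient-based co-search concrete, here is a minimal sketch, not DOSA's actual model. It assumes a toy performance model with one hardware parameter (`pes`, the number of processing elements) and one mapping parameter (`tile`, a tile size), both relaxed to continuous values and optimized in log space. All constants, function names, and the final rounding step are hypothetical choices made for illustration.

```python
import math

# Hypothetical workload and cost constants (illustrative, not from the paper).
MACS = 1.0e6       # multiply-accumulate operations in the layer
TRAFFIC = 1.0e6    # off-chip words moved when the tile size is 1
OVERHEAD = 1.0e3   # fixed latency overhead in cycles
E_MAC, E_DRAM, E_BUF, E_PE = 1.0, 100.0, 1.0e3, 1.0e3

def latency(pes, tile):
    # Compute cycles shrink with more PEs; memory cycles shrink with larger tiles.
    return OVERHEAD + MACS / pes + TRAFFIC / tile

def energy(pes, tile):
    # MAC energy, DRAM energy (less traffic with larger tiles),
    # plus costs that grow with buffer (tile) size and PE count.
    return E_MAC * MACS + E_DRAM * TRAFFIC / tile + E_BUF * tile + E_PE * pes

def log_edp(x, y):
    # x = log(pes), y = log(tile); log(EDP) = log(energy) + log(latency).
    pes, tile = math.exp(x), math.exp(y)
    return math.log(energy(pes, tile)) + math.log(latency(pes, tile))

def grad_log_edp(x, y):
    # Hand-derived gradient; the chain rule gives d/dx = pes * d/dpes.
    pes, tile = math.exp(x), math.exp(y)
    en, lat = energy(pes, tile), latency(pes, tile)
    gx = pes * (E_PE / en - (MACS / pes**2) / lat)
    gy = tile * ((E_BUF - E_DRAM * TRAFFIC / tile**2) / en
                 - (TRAFFIC / tile**2) / lat)
    return gx, gy

def co_search(x=0.0, y=0.0, lr=0.2, steps=500):
    # Plain gradient descent over the hardware and mapping parameters together.
    for _ in range(steps):
        gx, gy = grad_log_edp(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y

x, y = co_search()
# Rounding to integers stands in for projecting onto a valid discrete design.
pes, tile = round(math.exp(x)), round(math.exp(y))
print(pes, tile, energy(pes, tile) * latency(pes, tile))
```

Both parameters are updated in the same loop from one gradient, which is the contrast with exploring the two spaces independently. A real mapspace has many more dimensions and validity constraints than this two-variable toy.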

Problem

Research questions and friction points this paper is trying to address.

Simultaneously optimizing hardware parameters and algorithm-to-hardware mappings

Addressing combinatorial explosion in hardware design space exploration

Improving energy-delay product for DNN accelerator designs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Differentiable performance models for hardware and mappings

Gradient descent-based simultaneous space optimization

Modular analytical and learned model integration
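As a rough illustration of how an analytical model and a learned component might be combined, the sketch below fits a correction term on top of a toy analytical latency model. This summary does not describe the form of DOSA's learned model, so the log-linear least-squares fit here is a stand-in. The analytical model, the synthetic "measurements", and every function name are hypothetical.

```python
import math

def analytical_latency(pes, tile):
    # Hypothetical analytical model with illustrative constants.
    return 1.0e3 + 1.0e6 / pes + 1.0e6 / tile

def fit_log_residual(samples):
    """Least-squares fit of log(measured / analytical) ~ a + b * log(pes)."""
    xs = [math.log(p) for p, _, _ in samples]
    rs = [math.log(m / analytical_latency(p, t)) for p, t, m in samples]
    n = len(xs)
    mx, mr = sum(xs) / n, sum(rs) / n
    b = (sum((x - mx) * (r - mr) for x, r in zip(xs, rs))
         / sum((x - mx) ** 2 for x in xs))
    a = mr - b * mx
    return a, b

def combined_latency(pes, tile, a, b):
    # Analytical prediction scaled by the learned correction exp(a) * pes**b.
    # The product is still a smooth function of pes and tile.
    return analytical_latency(pes, tile) * math.exp(a) * pes ** b

def synthetic_measurement(pes, tile):
    # Synthetic stand-in for hardware measurements: the analytical model
    # is assumed to underestimate latency by a factor of 1.2 * pes**0.1.
    return analytical_latency(pes, tile) * 1.2 * pes ** 0.1

samples = [(p, t, synthetic_measurement(p, t))
           for p in (16, 64, 256, 1024) for t in (32, 128, 512)]
a, b = fit_log_residual(samples)
print(math.exp(a), b)
```

Because the correction multiplies the analytical prediction, the combined model stays differentiable in `pes` and `tile`, so it could be dropped into a gradient-based search without changing the optimizer.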
