🤖 AI Summary
Designing arithmetic units, especially multipliers, is challenging because the design space is vast and conventional manual or heuristic methods struggle to jointly optimize power, area, and adaptivity to input distributions. To address this, the authors propose a two-stage framework that combines a Transformer-based surrogate model with network inversion. First, the surrogate model, pretrained in a self-supervised manner and then fine-tuned with supervision, predicts hardware metrics (power and area) with high accuracy from few samples. Second, network inversion searches the surrogate in reverse to identify operand encodings tailored to the input distribution of specific AI workloads. The method automatically discovers novel low-power encodings that outperform the standard two's-complement representation, reducing switching activity by up to 18% on representative AI workloads. It also improves sample efficiency and convergence speed, and generalizes across diverse circuits, including finite-state machines.
📝 Abstract
As AI workloads proliferate, optimizing arithmetic units is becoming increasingly important to reduce the footprint of digital systems. Conventional design flows, which often rely on manual or heuristic optimization, are limited in their ability to explore the vast design space thoroughly. In this paper, we introduce GENIAL, a machine learning-based framework for the automatic generation and optimization of arithmetic units, specifically multipliers.
At the core of GENIAL is a Transformer-based surrogate model trained in two stages, self-supervised pretraining followed by supervised fine-tuning, to robustly forecast key hardware metrics such as power and area from abstracted design representations. By inverting the surrogate model, GENIAL efficiently searches for new operand encodings that directly minimize power consumption in arithmetic units for specific input data distributions. Extensive experiments on large datasets demonstrate that GENIAL is consistently more sample efficient than other methods and converges faster towards optimized designs. This makes it possible to deploy a high-effort logic synthesis optimization flow in the loop, improving the accuracy of the surrogate model. Notably, GENIAL automatically discovers encodings that achieve up to 18% switching activity savings within multipliers on representative AI workloads compared with conventional two's-complement encoding. We also demonstrate the versatility of our approach by achieving significant improvements on Finite State Machines, highlighting GENIAL's applicability to a wide spectrum of logic functions. Together, these advances mark a significant step toward automated Quality-of-Results-optimized combinational circuit generation for digital systems.
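To make the inverse-search idea concrete, here is a minimal, hypothetical sketch: a toy surrogate scores candidate operand encodings (permutations of binary codes) by predicted switching activity over a workload's value transitions, and an exhaustive search over tiny encodings picks the minimizer. All names, the workload, and the Hamming-distance surrogate are illustrative assumptions; GENIAL actually inverts a learned Transformer surrogate rather than enumerating encodings.

```python
import itertools

BITS = 2  # toy 2-bit operands, so only 4! = 24 candidate encodings


def hamming(a, b):
    # Number of bit positions in which two codes differ.
    return bin(a ^ b).count("1")


def surrogate_power(encoding, workload):
    # Toy surrogate: predicted switching activity = average Hamming
    # distance between the codes of consecutive operand values.
    return sum(hamming(encoding[x], encoding[y]) for x, y in workload) / len(workload)


def invert_search(workload, n=2 ** BITS):
    # Exhaustive "inversion" over all encodings; GENIAL instead searches
    # a learned surrogate model to scale far beyond toy bit widths.
    best = min(itertools.permutations(range(n)),
               key=lambda enc: surrogate_power(enc, workload))
    return list(best)


# Hypothetical workload dominated by 0 <-> 3 transitions: under the identity
# (binary) encoding, codes 00 and 11 differ in both bits, so an encoding that
# maps 0 and 3 to adjacent codes reduces predicted switching activity.
workload = [(0, 3), (3, 0), (0, 3), (1, 2)]
identity = list(range(2 ** BITS))
best = invert_search(workload)
print(surrogate_power(identity, workload))  # binary encoding baseline
print(surrogate_power(best, workload))      # searched encoding, no worse
```

The point of the sketch is the division of labor stated in the abstract: a cheap differentiable or learned proxy replaces full logic synthesis inside the search loop, and only the most promising encodings need to be validated by the high-effort synthesis flow.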