DOMAC: Differentiable Optimization for High-Speed Multipliers and Multiply-Accumulators

📅 2025-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Facing diminishing returns from Moore’s Law, multipliers and multiply-accumulate (MAC) units struggle to sustain improvements in performance and area efficiency. This paper proposes a process-aware differentiable architecture optimization framework. Its core innovation lies in modeling multi-level parallel Wallace/DA-type compression trees as neural-network-like structures, thereby reformulating discrete circuit optimization as a differentiable continuous optimization problem. By integrating process-dependent, differentiable timing and area models, the framework enables end-to-end automatic optimization using mainstream deep learning toolkits. Evaluated across multiple CMOS technology nodes, the method achieves, on average, 18% higher throughput and 23% smaller area compared to state-of-the-art open-source and commercial IP cores. These gains significantly enhance hardware energy efficiency for compute-intensive applications—particularly AI accelerators—without requiring manual design iteration or technology-specific heuristics.

📝 Abstract
Multipliers and multiply-accumulators (MACs) are fundamental building blocks for compute-intensive applications such as artificial intelligence. With the diminishing returns of Moore's Law, optimizing multiplier performance now necessitates process-aware architectural innovations rather than relying solely on technology scaling. In this paper, we introduce DOMAC, a novel approach that employs differentiable optimization for designing multipliers and MACs at specific technology nodes. DOMAC establishes an analogy between optimizing multi-staged parallel compressor trees and training deep neural networks. Building on this insight, DOMAC reformulates the discrete optimization challenge into a continuous problem by incorporating differentiable timing and area objectives. This formulation enables us to utilize existing deep learning toolkits for a highly efficient implementation of the differentiable solver. Experimental results demonstrate that DOMAC achieves significant enhancements in both performance and area efficiency compared to state-of-the-art baselines and commercial IPs in multiplier and MAC designs.
Problem

Research questions and friction points this paper is trying to address.

Optimizing multiplier and MAC performance via differentiable optimization
Reformulating the discrete circuit optimization as a continuous problem for efficiency
Enhancing performance and area efficiency in multiplier designs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differentiable optimization for multiplier design
Analogy between compressor trees and neural networks
Continuous reformulation of the discrete optimization problem
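The core idea of the paper, relaxing a discrete hardware choice into continuous variables with a differentiable timing/area cost and optimizing by gradient descent, can be sketched in a toy example. Note this is an illustrative assumption, not DOMAC's actual compressor-tree or process model: the cost terms, stage count, and constants below are invented for demonstration, and a deep learning toolkit's autodiff would replace the finite-difference gradient.

```python
import math

# Toy sketch (assumed model, not DOMAC's): relax the discrete choice of
# how many 3:2 compressors to place in each of two reduction stages into
# continuous variables, minimize a differentiable area + delay cost by
# gradient descent, then round back to integer compressor counts.

def cost(x):
    c1, c2 = x                                    # continuous "compressor counts"
    area = c1 + c2                                # area grows with compressor count
    delay = 4.0 / (c1 + 0.1) + 4.0 / (c2 + 0.1)  # more compressors per stage -> faster
    return area + 2.0 * delay                     # weighted area/delay trade-off

def grad(x, eps=1e-5):
    # Central finite differences; an autodiff framework would supply this.
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((cost(xp) - cost(xm)) / (2 * eps))
    return g

x = [1.0, 1.0]                                    # start with one compressor per stage
for _ in range(500):                              # plain gradient descent
    x = [max(0.1, xi - 0.05 * gi) for xi, gi in zip(x, grad(x))]

discrete = [round(xi) for xi in x]                # snap relaxation back to integers
print(discrete)                                   # -> [3, 3]
```

The continuous optimum here sits near c ≈ 2.73 per stage (where the marginal area cost balances the marginal delay gain), so rounding yields three compressors per stage; DOMAC applies the same relaxation idea at full scale with process-aware differentiable timing and area models.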
Chenhao Xue
School of Integrated Circuits, Peking University
AI, computer architecture, EDA
Yi Ren
School of Software and Microelectronics, Peking University, Beijing, China
Jinwei Zhou
School of Integrated Circuits, Anhui Polytechnic University, Wuhu, China
Kezhi Li
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Sha Tin, Hong Kong S.A.R.
Chen Zhang
Shanghai Jiao Tong University, Shanghai, China
Yibo Lin
Assistant Professor at Peking University
Deep learning, VLSI CAD, design for manufacturability
Lining Zhang
School of Electronic and Computer Engineering, Peking University, Shenzhen, China
Qiang Xu
National Center of Technology Innovation for EDA, Nanjing, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Sha Tin, Hong Kong S.A.R.
Guangyu Sun
School of Integrated Circuits, Peking University
Computer Architecture, Design Automation, Emerging Memory