🤖 AI Summary
To address the high power consumption of fixed-point multipliers and their poor fit to the zero-centered value distributions of AI workloads, this paper proposes an energy-efficiency optimization method based on sign-magnitude encoding. It splits the multiplier into a two's-complement-to-sign-magnitude encoder and a lightweight multiplication module that returns the result in the original format, exploiting the low switching activity inherent in sign-magnitude representation to reduce dynamic power. The two blocks are synthesized and optimized separately, and the overall circuit remains logic-equivalent, interface-transparent, and EDA-flow compatible; moreover, the scheme supports dynamic input-range adjustment for further energy savings. Post-synthesis evaluation using standard-cell libraries shows that a 4-bit multiplier achieves up to 12.9% lower switching activity under normally distributed inputs (σ = 3.0) than synthesis without decomposition, and up to 33% lower when the logic-equivalence requirement is lifted and inputs are restricted to [−7, +7]. Switching-activity-driven design-space exploration during synthesis yields a further 5-10% improvement in power efficiency over a power-agnostic approach.
📝 Abstract
This work presents a method to maximize the power efficiency of fixed-point multiplier units by decomposing them into sub-components. First, an encoder block converts the operands from two's-complement to sign-magnitude representation; a multiplier module then performs the multiplication and outputs the result in the original format. This makes it possible to exploit the power efficiency of sign-magnitude encoding for the multiplication. To ensure the computing format is not altered, the two components are synthesized and optimized separately. The method yields significant power savings for input values centered around zero, as commonly encountered in AI workloads. Under a realistic input stream of normally distributed values with a standard deviation of 3.0, post-synthesis simulations of the 4-bit multiplier design show up to 12.9% lower switching activity than synthesis without decomposition. These gains are achieved while remaining compatible with any production-ready system, since the overall circuit stays logic-equivalent. With this compliance requirement lifted and a slightly smaller input range of -7 to +7, switching-activity reductions reach up to 33%. Additionally, we demonstrate that synthesis optimization based on switching-activity-driven design-space exploration can yield a further 5-10% improvement in power efficiency compared to a power-agnostic approach.
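As a rough illustration of the decomposition, and of why sign-magnitude words toggle less for zero-centered data, the following Python sketch models the encoder and multiplier blocks behaviorally and counts bit flips on an operand stream under both encodings. All function names, the sample count, the seed, and the clipping of inputs to [-7, +7] (since -8 has no 4-bit sign-magnitude code) are assumptions made for this sketch, not details taken from the paper. It counts toggles on the operand words only, so its numbers are not comparable to the post-synthesis switching-activity figures quoted above.

```python
import random

WIDTH = 4                      # operand width in bits
MAG_BITS = WIDTH - 1           # magnitude bits in the sign-magnitude code
LIMIT = (1 << MAG_BITS) - 1    # 7: largest magnitude the code can hold


def twos_complement(x, width=WIDTH):
    """Bit pattern of the signed integer x in two's complement."""
    return x & ((1 << width) - 1)


def encode_sign_magnitude(tc_bits):
    """Encoder block: 4-bit two's-complement pattern -> sign-magnitude pattern."""
    sign = tc_bits >> MAG_BITS
    value = tc_bits - (1 << WIDTH) if sign else tc_bits
    if abs(value) > LIMIT:
        # Assumption of this sketch: -8 is simply rejected.
        raise ValueError("-8 has no 4-bit sign-magnitude code")
    return (sign << MAG_BITS) | abs(value)


def multiply_sign_magnitude(sm_a, sm_b):
    """Multiplier block: sign-magnitude operands -> 8-bit two's-complement product."""
    mask = (1 << MAG_BITS) - 1
    magnitude = (sm_a & mask) * (sm_b & mask)
    negative = (sm_a >> MAG_BITS) ^ (sm_b >> MAG_BITS)
    product = -magnitude if negative else magnitude
    return twos_complement(product, 2 * WIDTH)


def toggles(words):
    """Total number of bit flips between consecutive words of a stream."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


def sample_stream(n, sigma=3.0, seed=0):
    """Rounded, clipped samples from a zero-mean normal distribution."""
    rng = random.Random(seed)
    return [max(-LIMIT, min(LIMIT, round(rng.gauss(0.0, sigma)))) for _ in range(n)]


stream = sample_stream(20000)
tc_words = [twos_complement(x) for x in stream]
sm_words = [encode_sign_magnitude(w) for w in tc_words]

tc_toggles = toggles(tc_words)
sm_toggles = toggles(sm_words)
print(f"two's-complement operand toggles: {tc_toggles}")
print(f"sign-magnitude operand toggles:   {sm_toggles}")
print(f"relative reduction: {1 - sm_toggles / tc_toggles:.1%}")
```

In this model the sign bit and the least-significant bit behave identically in both encodings. The difference comes from the upper magnitude bits, which stay at 0 for small negative values in sign-magnitude form but are set to 1 in two's complement.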