🤖 AI Summary
General-purpose microprocessors are inefficient for machine learning (ML) tasks under the stringent area and power constraints inherent to printed electronics. To address this, the paper proposes an ML-oriented bespoke design methodology for printed microprocessors. The approach eliminates logic that the target application does not need and integrates a configurable SIMD multiply-accumulate (MAC) unit supporting four precision configurations; it is evaluated on the open-source Zero-Riscy core. Across six ML models, the evaluation reports a 22.2% area reduction, 23.6% power savings, and a 33.79% speedup over the baseline design, without compromising inference accuracy. Compared with state-of-the-art printed processors, the approach still offers significant speedups, but at the cost of some accuracy degradation.
📝 Abstract
Printed electronics have gained significant traction in recent years, presenting a viable path to integrating computing into everyday items, from disposable products to low-cost healthcare. However, the adoption of computing in these domains is hindered by strict area and power constraints, which limit the effectiveness of general-purpose microprocessors. This paper proposes a bespoke microprocessor design approach that addresses these challenges by tailoring the design to specific applications and eliminating unnecessary logic. Targeting machine learning applications, we further optimize core operations by integrating a SIMD MAC unit that supports 4 precision configurations and boosts the efficiency of the microprocessor. Our evaluation across 6 ML models and the large-scale Zero-Riscy core shows that our methodology can achieve improvements of 22.2%, 23.6%, and 33.79% in area, power, and speed, respectively, without compromising accuracy. Against state-of-the-art printed processors, our approach can still offer significant speedups, albeit with some accuracy degradation. This work explores how such trade-offs can enable low-power printed microprocessors for diverse ML applications.
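To make the idea of a multi-precision SIMD MAC concrete, the following is a minimal software sketch, not the authors' hardware design. It assumes a 32-bit datapath and lane widths of 32, 16, 8, and 4 bits; the abstract states only that four precision configurations exist, so these widths, and the helper names `pack_lanes`, `unpack_lanes`, and `simd_mac`, are illustrative assumptions.

```python
WORD_BITS = 32                # assumed datapath width (not stated in the abstract)
LANE_WIDTHS = (32, 16, 8, 4)  # assumed precision configurations (the abstract only says "4")


def pack_lanes(values, width):
    """Pack signed integers into one WORD_BITS-wide word, lane 0 in the low bits."""
    if width not in LANE_WIDTHS:
        raise ValueError(f"unsupported lane width: {width}")
    if len(values) != WORD_BITS // width:
        raise ValueError("number of values must equal the number of lanes")
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    mask = (1 << width) - 1
    word = 0
    for i, v in enumerate(values):
        if not lo <= v <= hi:
            raise ValueError(f"{v} does not fit in a signed {width}-bit lane")
        word |= (v & mask) << (i * width)  # two's-complement encoding per lane
    return word


def unpack_lanes(word, width):
    """Split a packed word back into signed lane values."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    lanes = []
    for i in range(WORD_BITS // width):
        raw = (word >> (i * width)) & mask
        lanes.append(raw - (1 << width) if raw & sign_bit else raw)
    return lanes


def simd_mac(acc, a_word, b_word, width):
    """Return acc + sum of lane-wise products of two packed words."""
    a = unpack_lanes(a_word, width)
    b = unpack_lanes(b_word, width)
    return acc + sum(x * y for x, y in zip(a, b))


# Four 8-bit lanes: 10 + (1*5 + -2*6 + 3*-7 + -4*8) = -50
a = pack_lanes([1, -2, 3, -4], 8)
b = pack_lanes([5, 6, -7, 8], 8)
print(simd_mac(10, a, b, 8))  # -50
```

In this model, a narrower lane width lets one operation cover more multiply-accumulate pairs (one pair at 32 bits, eight at 4 bits), while each lane can represent a smaller range of values.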