🤖 AI Summary
Conventional computing on edge devices suffers from low energy efficiency, while stochastic computing (SC) is limited by the overhead of stochastic bit-stream (SBS) generation, and ReRAM-based implementations must cope with the intrinsic unreliability of ReRAM devices. Method: This paper proposes a fully in-memory stochastic computing architecture that integrates true random number generation, in-situ SBS logic operations (AND/MAJ), and SBS–binary encoding/decoding directly within a ReRAM crossbar array, avoiding the movement of bit-streams between memory and compute units. Contribution/Results: The architecture relies on SC's inherent fault tolerance to absorb ReRAM cell variability. Experiments demonstrate 1.39× and 2.16× throughput improvements over state-of-the-art CMOS-based and ReRAM-based SC designs, respectively, along with 1.15× and 2.8× energy improvements. Image quality drops by 5% on average across multiple SBS lengths and image processing tasks.
📝 Abstract
As the demand for efficient, low-power computing in embedded and edge devices grows, traditional computing methods are becoming less effective for handling complex tasks. Stochastic computing (SC) offers a promising alternative by approximating complex arithmetic operations, such as addition and multiplication, using simple bitwise operations, like majority or AND, on random bit-streams. While SC operations are inherently fault-tolerant, their accuracy largely depends on the length and quality of the stochastic bit-streams (SBS). These bit-streams are typically generated by CMOS-based stochastic bit-stream generators that consume over 80% of the SC system's power and area. Current SC solutions focus on optimizing the logic gates but often neglect the high cost of moving the bit-streams between memory and processor. This work leverages the physics of emerging ReRAM devices to implement the entire SC flow in place: (1) generating low-cost true random numbers and SBSs, (2) conducting SC operations, and (3) converting SBSs back to binary. Considering the low reliability of ReRAM cells, we demonstrate how SC's robustness to errors copes with ReRAM's variability. Our evaluation shows throughput improvements of 1.39x and 2.16x and energy-consumption improvements of 1.15x and 2.8x over state-of-the-art CMOS-based and ReRAM-based solutions, respectively, with an average image quality drop of 5% across multiple SBS lengths and image processing tasks.
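The arithmetic described above can be illustrated with a minimal software sketch of the three-step SC flow: encode values as bit-streams, operate on them bitwise, and convert back to a number by counting ones. This is not the paper's in-memory implementation. It assumes unipolar encoding (a value in [0, 1] is the probability of a 1), uses Python's pseudo-random generator as a stand-in for the ReRAM-based true random number source, and pairs the majority gate with a third stream of probability 0.5 to obtain scaled addition, which is one common formulation. All function names and the bit-flip fault model are hypothetical.

```python
import random

def to_sbs(value, length, rng):
    """Encode value in [0, 1] as a bit-stream whose bits are 1 with probability `value`."""
    return [1 if rng.random() < value else 0 for _ in range(length)]

def to_value(sbs):
    """Convert a bit-stream back to a number: the fraction of ones."""
    return sum(sbs) / len(sbs)

def sc_multiply(a, b):
    """Bitwise AND of independent streams; the expected value is p_a * p_b."""
    return [x & y for x, y in zip(a, b)]

def sc_scaled_add(a, b, half):
    """3-input majority with an independent 0.5 stream; the expected value is (p_a + p_b) / 2."""
    return [(x & y) | (x & s) | (y & s) for x, y, s in zip(a, b, half)]

def flip_bits(sbs, error_rate, rng):
    """Hypothetical fault model: flip each bit independently with probability error_rate."""
    return [bit ^ 1 if rng.random() < error_rate else bit for bit in sbs]

rng = random.Random(0)  # software PRNG standing in for a hardware TRNG (assumption)
n = 4096                # stream length; longer streams reduce random error
a = to_sbs(0.6, n, rng)
b = to_sbs(0.5, n, rng)
half = to_sbs(0.5, n, rng)

product = to_value(sc_multiply(a, b))              # close to 0.6 * 0.5 = 0.30
scaled_sum = to_value(sc_scaled_add(a, b, half))   # close to (0.6 + 0.5) / 2 = 0.55
noisy_product = to_value(flip_bits(sc_multiply(a, b), 0.01, rng))  # 1% of bits flipped
```

Because every bit in a stream carries equal weight, a single flipped bit shifts the decoded value by only 1/n. That is the property the abstract relies on when it pairs SC with unreliable ReRAM cells. The trade-off is that accuracy depends on stream length and on the streams being independent.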