🤖 AI Summary
This work addresses the memory performance bottleneck that probabilistic computation creates in trustworthy artificial intelligence by proposing a unified memory analysis framework that jointly models deterministic memory accesses and stochastic sampling, treating the former as a limiting case of the latter. It establishes, for the first time, a holistic perspective that integrates data movement with stochasticity provisioning, revealing how memory systems operate under entropy constraints and formulating memory-level evaluation criteria tailored to probabilistic AI. By combining analytical methods from probabilistic computing and memory architecture, and assessing distribution programmability, parallel compatibility, and robustness to hardware non-idealities, the study clarifies the limitations of conventional architectures and outlines a scalable pathway toward probabilistic in-memory computing hardware, laying a theoretical and design foundation for efficient and trustworthy AI systems.
📝 Abstract
Trustworthy artificial intelligence increasingly relies on probabilistic computation to achieve robustness, interpretability, security and privacy. In practical systems, such workloads interleave deterministic data access with repeated stochastic sampling across models, data paths and system functions, shifting performance bottlenecks from arithmetic units to memory systems that must deliver both data and randomness. Here we present a unified data-access perspective in which deterministic access is treated as a limiting case of stochastic sampling, enabling both modes to be analyzed within a common framework. This view reveals that increasing stochastic demand reduces effective data-access efficiency and can drive systems into entropy-limited operation. Based on this insight, we define memory-level evaluation criteria, including unified operation, distribution programmability, efficiency, robustness to hardware non-idealities and parallel compatibility. Using these criteria, we analyze limitations of conventional architectures and examine emerging probabilistic compute-in-memory approaches that integrate sampling with memory access, outlining pathways toward scalable hardware for trustworthy AI.
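To make the unified data-access perspective concrete, the following is a minimal conceptual sketch (not from the paper; the class `UnifiedMemory` and its parameters `noise_std`, `entropy_bits`, and `entropy_budget_bits` are hypothetical illustrations). It models every access as a draw from a distribution centered on the stored value, so a deterministic read appears as the zero-noise, zero-entropy limiting case, while stochastic samples consume a shared entropy budget.

```python
import random

class UnifiedMemory:
    """Conceptual sketch: every memory access is a sample from a distribution.

    A deterministic read is the limiting case where the distribution collapses
    to a point mass on the stored value (no entropy drawn); a stochastic access
    draws from a programmed distribution and consumes entropy from a shared budget.
    """

    def __init__(self, size, entropy_budget_bits):
        self.cells = [0.0] * size
        self.entropy_budget_bits = entropy_budget_bits  # randomness the memory can supply

    def write(self, addr, value):
        self.cells[addr] = value

    def sample(self, addr, noise_std=0.0, entropy_bits=0.0):
        """Unified access: noise_std == 0 reduces to a deterministic read."""
        if entropy_bits > 0.0:
            if self.entropy_budget_bits < entropy_bits:
                raise RuntimeError("entropy-limited: sampling demand exceeds supply")
            self.entropy_budget_bits -= entropy_bits
        value = self.cells[addr]
        if noise_std == 0.0:
            return value                       # deterministic limit: point mass at stored value
        return random.gauss(value, noise_std)  # stochastic case: draw around stored value


mem = UnifiedMemory(size=8, entropy_budget_bits=32.0)
mem.write(3, 1.5)
print(mem.sample(3))                                   # deterministic read -> 1.5
print(mem.sample(3, noise_std=0.1, entropy_bits=8.0))  # stochastic sample near 1.5
```

In this toy model, deterministic reads never decrement the entropy budget, while repeated stochastic samples eventually exhaust it, which is one way to picture the abstract's point that growing sampling demand lowers effective data-access efficiency and can push a system into entropy-limited operation.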