🤖 AI Summary
Active Inference (AIF) suffers from high computational and memory overhead, which hinders its deployment on resource-constrained real-time and embedded systems. To address this, we propose a hardware-efficient AIF computing architecture: building on the pymdp framework, we construct a sparse, unified computational graph that explicitly optimizes computation flow and memory-access patterns while preserving model flexibility. This work presents the first customized hardware adaptation of AIF for edge devices. Experimental evaluation demonstrates substantial efficiency gains, over 2× lower latency and up to 35% lower peak memory footprint, without compromising inference fidelity. The core innovation is mapping AIF's Bayesian inference process onto a sparse, statically schedulable computational graph, jointly optimizing algorithmic accuracy and hardware execution efficiency. The proposed architecture provides a scalable, system-level solution for deploying lightweight active-inference agents in practical edge scenarios.
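To make the sparsity argument concrete, here is a minimal, illustrative sketch (not taken from the paper's codebase) of the discrete Bayesian state-update at the heart of AIF: the posterior over hidden states is the likelihood-weighted prior, q(s) ∝ P(o|s)·p(s). When the likelihood matrix `A` has many zero entries, a static schedule can skip those multiplies entirely, which is the kind of saving a sparse, statically schedulable graph exploits. The function names and the toy `A` matrix below are assumptions for illustration only.

```python
import numpy as np

def infer_states(A, prior, obs):
    """Dense posterior over hidden states for one discrete observation.

    A[o, s] = P(o | s); prior[s] = p(s). Returns q(s) ∝ A[obs, :] * prior.
    """
    unnorm = A[obs, :] * prior          # likelihood-weighted prior
    return unnorm / unnorm.sum()        # normalize to a distribution

def infer_states_sparse(A, prior, obs):
    """Same update, but touching only the nonzero likelihood entries.

    The nonzero index set is exactly what a static scheduler could bake
    into the computational graph ahead of time.
    """
    row = A[obs, :]
    nz = np.flatnonzero(row)            # columns with nonzero likelihood
    post = np.zeros_like(prior)
    post[nz] = row[nz] * prior[nz]
    return post / post.sum()

if __name__ == "__main__":
    # Toy likelihood: 2 observations x 3 hidden states (illustrative values)
    A = np.array([[0.9, 0.0, 0.1],      # P(o=0 | s)
                  [0.1, 1.0, 0.9]])     # P(o=1 | s)
    prior = np.ones(3) / 3
    dense = infer_states(A, prior, obs=0)
    sparse = infer_states_sparse(A, prior, obs=0)
    assert np.allclose(dense, sparse)   # identical result, fewer multiplies
    print(dense)                        # → [0.9 0.  0.1]
```

The dense and sparse paths produce the same posterior; the sparse variant simply restricts arithmetic to the precomputable nonzero support, which is the algorithmic hook the architecture's latency and memory savings rest on.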
📝 Abstract
Active Inference (AIF) offers a robust framework for decision-making, yet its computational and memory demands pose challenges for deployment, especially in resource-constrained environments. This work presents a methodology that facilitates AIF deployment by combining pymdp's flexibility and efficiency with a unified, sparse computational graph tailored for hardware-efficient execution. Our approach reduces latency by over 2× and peak memory by up to 35%, advancing the deployment of efficient AIF agents in real-time and embedded applications.