🤖 AI Summary
To address the substantial hardware overhead and frequent off-chip memory accesses incurred by confidentiality and integrity protection in DNN accelerators for safety-critical applications such as autonomous driving, healthcare, and finance, this paper proposes a software-hardware co-designed secure acceleration architecture. We introduce three key innovations: (1) a bandwidth-aware encryption mechanism that adapts encryption granularity to memory bandwidth constraints; (2) an optimal tiling strategy that jointly optimizes intra-layer and inter-layer partitioning to minimize data movement; and (3) a lightweight, multi-level integrity verification scheme leveraging hierarchical checksums and Merkle trees. Our approach significantly reduces memory bandwidth pressure and hardware resource utilization while preserving strong security guarantees. Evaluated on both server-grade and edge-class NPUs, it reduces performance overhead by over 12% compared with baseline secure accelerators, with strong scalability and practical deployability for high-assurance DNN inference scenarios.
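The paper does not publish SeDA's verification code, but the Merkle-tree component of a multi-level integrity scheme can be sketched generically. The snippet below (hypothetical helper names, SHA-256 chosen arbitrarily as the hash) builds a hash tree over off-chip memory blocks and re-verifies one block against the on-chip root via its authentication path; it is an illustrative sketch, not SeDA's actual mechanism.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Leaf/node hash; SeDA's actual hash or MAC may differ."""
    return hashlib.sha256(data).digest()

def build_merkle(blocks):
    """Build a Merkle tree bottom-up; returns the list of levels, leaves first.
    Only the root (levels[-1][0]) would need to stay in trusted on-chip storage."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        # Duplicate the last node when a level has an odd number of entries.
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels

def verify_block(block, index, levels):
    """Recompute the hashes on the block's path to the root and compare."""
    node = h(block)
    for level in levels[:-1]:
        sib = index ^ 1
        sibling = level[sib] if sib < len(level) else level[index]  # odd-level duplicate
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == levels[-1][0]
```

In a hardware setting the same idea applies per memory region: a tampered block changes a leaf hash, so the recomputed path no longer matches the stored root, while verifying one block touches only O(log n) nodes rather than the whole region.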
📝 Abstract
Ensuring the confidentiality and integrity of DNN accelerators is paramount across scenarios spanning autonomous driving, healthcare, and finance. However, current security approaches typically require extensive hardware resources and incur significant off-chip memory access overheads. This paper introduces SeDA, which utilizes 1) a bandwidth-aware encryption mechanism to improve hardware resource efficiency, 2) optimal block granularity through intra-layer and inter-layer tiling patterns, and 3) a multi-level integrity verification mechanism that minimizes, or even eliminates, memory access overheads. Experimental results show that SeDA decreases performance overhead by over 12% on both server and edge neural processing units (NPUs) while ensuring robust scalability.