🤖 AI Summary
To address the challenge of simultaneously achieving security, performance, and scalability for NVMe-over-Fabrics (NVMe-oF) disaggregated storage in confidential computing environments, this paper proposes sNVMe-oF, a hardware-coordinated secure storage management framework. The approach integrates confidential computing, NVIDIA BlueField-3 SmartNIC acceleration, and a streamlined IPSec design that eliminates redundant security overhead. Key contributions include: (1) a lightweight control-path enhancement and counter-leasing protocol enabling protocol-agnostic access control and freshness guarantees without modifying the NVMe-oF protocol; (2) a disaggregated Hazel Merkle Tree (HMT) that, together with NVMe metadata, enables efficient, low-overhead data integrity verification; and (3) holistic hardware-software co-design for end-to-end trust. Evaluated under AI training and synthetic workloads, the prototype incurs as little as 2% performance degradation while sustaining line-rate throughput and avoiding excessive resource usage, improving the security-performance trade-off in disaggregated storage systems.
📝 Abstract
Disaggregated storage with NVMe-over-Fabrics (NVMe-oF) has emerged as the standard solution in modern data centers, achieving superior performance, resource utilization, and power efficiency. Simultaneously, confidential computing (CC) is becoming the de facto security paradigm, enforcing stronger isolation and protection for sensitive workloads. However, traditional CC methods struggle to scale when applied to state-of-the-art storage, and they compromise either performance or security. To address these issues, we introduce sNVMe-oF, a storage management system that extends the NVMe-oF protocol and adheres to the CC threat model by providing confidentiality, integrity, and freshness guarantees. sNVMe-oF offers an appropriate control path and novel concepts such as counter-leasing. It also optimizes data-path performance by leveraging NVMe metadata, introducing a new disaggregated Hazel Merkle Tree (HMT), and avoiding redundant IPSec protections. We achieve this without modifying the NVMe-oF protocol. To prevent excessive resource usage while delivering line rate, sNVMe-oF also uses the accelerators of CC-capable smart NICs. We prototype sNVMe-oF on an NVIDIA BlueField-3 and demonstrate that it incurs as little as 2% performance degradation for synthetic patterns and AI training.