An Early Experience with Confidential Computing Architecture for On-Device Model Protection

📅 2025-04-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deploying machine learning models on edge devices involves a critical trade-off: unprotected models face high model-stealing risks, while existing Trusted Execution Environment (TEE) architectures impose performance overhead and offer weak protection guarantees. Method: This paper presents the first systematic evaluation of Arm's Confidential Computing Architecture (CCA) for privacy-performance trade-offs in edge ML inference. We propose a lightweight trusted inference framework on CCA that integrates encrypted model loading, secure execution, and memory isolation to mitigate membership inference attacks. Contribution/Results: Our framework reduces the attack success rate by 8.3% while introducing at most 22% inference overhead across diverse tasks, including image classification, speech recognition, and conversational assistants, without compromising model confidentiality. All code and evaluation methodology are publicly released, offering a practical path toward confidential AI deployment on resource-constrained edge platforms.
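The "encrypted model loading" piece of the framework can be pictured as: weights are shipped in sealed form and only decrypted inside the isolated realm, so plaintext never touches normal-world memory. The sketch below is illustrative and not the paper's implementation; the function names (`seal_model`, `load_model_in_realm`) are hypothetical, and the HMAC-based stream cipher is a stdlib stand-in for a real AEAD such as AES-GCM.

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy counter-mode keystream built from HMAC-SHA256.
    # A production realm would use an authenticated cipher (e.g. AES-GCM).
    blocks = []
    counter = 0
    while sum(len(b) for b in blocks) < length:
        blocks.append(hmac.new(key, nonce + counter.to_bytes(8, "big"),
                               hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:length]

def seal_model(weights: bytes, key: bytes) -> tuple[bytes, bytes]:
    # Encrypt the weights before shipping the model to the device,
    # so only a realm provisioned with the key can recover them.
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(weights))
    return nonce, bytes(w ^ s for w, s in zip(weights, stream))

def load_model_in_realm(nonce: bytes, sealed: bytes, key: bytes) -> bytes:
    # Decryption happens only inside the isolated realm, so plaintext
    # weights never appear outside CCA-protected memory.
    stream = _keystream(key, nonce, len(sealed))
    return bytes(c ^ s for c, s in zip(sealed, stream))
```

A round trip (`seal_model` on the provider side, `load_model_in_realm` on the device) recovers the original weights; everything the normal world sees is the sealed blob and the nonce.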

📝 Abstract
Deploying machine learning (ML) models on user devices can improve privacy (by keeping data local) and reduce inference latency. Trusted Execution Environments (TEEs) are a practical solution for protecting proprietary models, yet existing TEE solutions have architectural constraints that hinder on-device model deployment. Arm Confidential Computing Architecture (CCA), a new Arm extension, addresses several of these limitations and shows promise as a secure platform for on-device ML. In this paper, we evaluate the performance-privacy trade-offs of deploying models within CCA, highlighting its potential to enable confidential and efficient ML applications. Our evaluations show that CCA incurs an overhead of at most 22% when running models of different sizes and applications, including image classification, voice recognition, and chat assistants. This performance overhead comes with privacy benefits: for example, our framework successfully protects the model against membership inference attacks, reducing the adversary's success rate by 8.3%. To support further research and early adoption, we make our code and methodology publicly available.
Problem

Research questions and friction points this paper is trying to address.

Protecting proprietary ML models on user devices
Overcoming architectural constraints of existing TEE solutions
Balancing performance and privacy in confidential ML deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Arm Confidential Computing Architecture (CCA)
Evaluates performance-privacy trade-offs
Protects models against inference attacks
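The protection claim in the last bullet is typically quantified as the attacker's success rate: how often an adversary correctly guesses whether a sample was in the training set. A minimal sketch of that metric, assuming a simple confidence-threshold attacker (the function name, threshold, and sample confidences below are illustrative, not the paper's attack):

```python
def mia_success_rate(member_conf, nonmember_conf, threshold=0.5):
    # Confidence-threshold membership inference: guess "member" when the
    # model's confidence on a sample is at least `threshold`.
    correct = sum(c >= threshold for c in member_conf)        # true members
    correct += sum(c < threshold for c in nonmember_conf)     # true non-members
    return correct / (len(member_conf) + len(nonmember_conf))

# Hypothetical per-sample confidences for training members vs. held-out samples.
rate = mia_success_rate([0.9, 0.8, 0.4], [0.3, 0.6, 0.2])  # 4 of 6 correct
```

Comparing this rate with and without the CCA-backed framework gives the kind of success-rate reduction the paper reports (8.3%); a rate near 0.5 means the attacker does no better than random guessing.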