🤖 AI Summary
Privacy-preserving techniques incur prohibitively high computational and communication overhead in large language model (LLM) scenarios, severely hindering practical deployment. This paper proposes a hardware-software-algorithm co-design framework that systematically optimizes foundational privacy primitives—including secure multi-party computation (MPC), zero-knowledge proofs (ZKPs), and fully homomorphic encryption (FHE)—to scale efficiently to LLM-sized training and inference. Our core contribution lies in rearchitecting the computational paradigms and communication protocols of these primitives specifically for LLM workloads, enabling cross-layer joint optimization. Experimental evaluation demonstrates 10–100× speedup over state-of-the-art baselines across diverse applications: deep neural network intellectual property protection, ethically governed LLM access control, and private Transformer inference. The framework significantly improves both the practicality and deployability of privacy-preserving machine learning systems.
📝 Abstract
Privacy-preserving technologies have introduced a paradigm shift that makes secure computation realizable in real-world systems. The chief barrier to practical adoption of these primitives is the computational and communication overhead they incur at scale. In this paper, we present an overview of our efforts to close the gap between this overhead and practicality for privacy-preserving learning systems built on multi-party computation (MPC), zero-knowledge proofs (ZKPs), and fully homomorphic encryption (FHE). Through meticulous hardware/software/algorithm co-design, we show progress toward enabling LLM-scale applications in privacy-preserving settings. We demonstrate the efficacy of our solutions in several contexts, including DNN intellectual-property ownership, ethical LLM usage enforcement, and Transformer inference.
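To give a concrete flavor of the MPC primitives the abstract refers to, here is a minimal sketch of 2-of-2 additive secret sharing over a prime field, the basic building block behind many MPC protocols. This is an illustrative assumption, not the paper's actual protocol; the modulus, function names, and parameters are chosen here for exposition only.

```python
import secrets

# Illustrative sketch only: additive secret sharing over a prime field.
# Not the paper's protocol; the prime and helper names are assumptions.
P = 2**61 - 1  # a Mersenne prime, chosen for illustration

def share(x, p=P):
    """Split x into two shares; each share alone is uniformly random."""
    s0 = secrets.randbelow(p)
    s1 = (x - s0) % p
    return s0, s1

def reconstruct(s0, s1, p=P):
    """Recombine both shares to recover the secret."""
    return (s0 + s1) % p

def add_shares(a, b, p=P):
    """Each party adds its own shares locally; addition needs no communication."""
    return tuple((ai + bi) % p for ai, bi in zip(a, b))

# Secure addition: shares of 20 and 22 combine to shares of 42.
x_sh, y_sh = share(20), share(22)
z_sh = add_shares(x_sh, y_sh)
assert reconstruct(*z_sh) == 42
```

Additions are "free" in this scheme, which is why the expensive steps in MPC for LLM workloads are multiplications and non-linear operations (e.g., attention softmax), the very costs that the co-design described above targets.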