Membrane: A Cryptographic Access Control System for Data Lakes

📅 2025-09-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the risk of unauthorized access to sensitive data in data lakes, this paper proposes Membrane, a system that combines encryption at rest with a SQL-aware encryption protocol. Because data lakes separate compute and storage into different trust domains, Membrane can cryptographically enforce data-dependent, fine-grained access control views while leaving analytical queries unrestricted. Its key design choice is to decrypt the views a user is authorized for once, at the start of an interactive session; all subsequent queries run over plaintext. At the storage layer, Membrane uses block ciphers, a symmetric-key primitive with hardware acceleration in CPUs, and its SQL-aware protocol enforces column- and row-level access policies. The first query result in a session is delayed by up to ≈20×, but later queries add little overhead, so the amortized cost over a session is low.
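The "decrypt once at session start, then query in plaintext" idea can be illustrated with a small sketch. This is not Membrane's protocol: the key hierarchy, the function names (`derive_key`, `keystream_xor`, `encrypt_table`, `open_session`), and the row layout are all hypothetical, and an HMAC-SHA256 keystream stands in for the hardware-accelerated block cipher because the Python standard library ships no AES. The sketch also stores the view label in the clear and omits integrity protection, which a real design would have to handle.

```python
import hashlib
import hmac
import json


def derive_key(master_key: bytes, label: str) -> bytes:
    """Derive a per-view key from a master key (hypothetical key hierarchy)."""
    return hmac.new(master_key, label.encode(), hashlib.sha256).digest()


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 counter-mode keystream.

    Stand-in for a block cipher in counter mode; illustration only.
    Applying it twice with the same key and nonce returns the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def encrypt_table(master_key: bytes, rows: list, policy_column: str) -> list:
    """Encrypt each row under the key of the view selected by its own data."""
    stored = []
    for i, row in enumerate(rows):
        label = f"{policy_column}={row[policy_column]}"  # data-dependent view
        nonce = i.to_bytes(8, "big")  # unique per row in this toy table
        plaintext = json.dumps(row, sort_keys=True).encode()
        key = derive_key(master_key, label)
        stored.append((label, nonce, keystream_xor(key, nonce, plaintext)))
    return stored


def open_session(stored: list, view_keys: dict) -> list:
    """Decrypt, once, every row whose view key the user holds."""
    visible = []
    for label, nonce, ciphertext in stored:
        key = view_keys.get(label)
        if key is not None:
            visible.append(json.loads(keystream_xor(key, nonce, ciphertext)))
    return visible


MASTER = b"\x00" * 32  # demo key only; never hard-code real keys
ROWS = [
    {"id": 1, "region": "EU", "amount": 40},
    {"id": 2, "region": "US", "amount": 75},
    {"id": 3, "region": "EU", "amount": 10},
]
STORED = encrypt_table(MASTER, ROWS, "region")

# A user restricted to the EU view receives only that view's key.
EU_KEYS = {"region=EU": derive_key(MASTER, "region=EU")}
SESSION = open_session(STORED, EU_KEYS)  # one-time decryption cost

# Later queries are ordinary plaintext computation over the session data.
EU_TOTAL = sum(r["amount"] for r in SESSION)
print([r["id"] for r in SESSION], EU_TOTAL)  # [1, 3] 50
```

In this toy version, a user without the US key cannot recover row 2 even with full read access to `STORED`, while any query over `SESSION` runs without further cryptographic work.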

📝 Abstract

Organizations use data lakes to store and analyze sensitive data. But hackers may compromise data lake storage to bypass access controls and access sensitive data. To address this, we propose Membrane, a system that (1) cryptographically enforces data-dependent access control views over a data lake, (2) without restricting the analytical queries data scientists can run. We observe that data lakes, unlike DBMSes, disaggregate computation and storage into separate trust domains, making at-rest encryption sufficient to defend against remote attackers targeting data lake storage, even when running analytical queries in plaintext. This leads to a new system design for Membrane that combines encryption at rest with SQL-aware encryption. Using block ciphers, a fast symmetric-key primitive with hardware acceleration in CPUs, we develop a new SQL-aware encryption protocol well-suited to at-rest encryption. Membrane adds overhead only at the start of an interactive session due to decrypting views, delaying the first query result by up to $\approx 20\times$; subsequent queries process decrypted data in plaintext, resulting in low amortized overhead.
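To make "low amortized overhead" concrete, the following back-of-the-envelope sketch computes the average slowdown over a session. It rests on simplifying assumptions that are not from the paper: every query has the same baseline cost, the first query pays the full worst-case 20× delay, and later queries run at exactly baseline speed. The function name `amortized_slowdown` is hypothetical.

```python
def amortized_slowdown(
    num_queries: int,
    first_query_slowdown: float = 20.0,  # worst case quoted in the abstract
    later_slowdown: float = 1.0,         # assumption: later queries match baseline
) -> float:
    """Average slowdown per query over a session of equal-cost queries."""
    if num_queries < 1:
        raise ValueError("a session needs at least one query")
    total = first_query_slowdown + (num_queries - 1) * later_slowdown
    return total / num_queries


for n in (1, 20, 100):
    print(n, amortized_slowdown(n))
# 1 20.0
# 20 1.95
# 100 1.19
```

Under these assumptions the one-time decryption cost dominates short sessions and fades as the session grows: a 20-query session averages just under 2× the baseline, and a 100-query session about 1.2×.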

Problem

Research questions and friction points this paper is trying to address.

Enforcing cryptographic access control for data lake security

Protecting sensitive data from storage compromise attacks

Enabling analytical queries without restricting data scientist operations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cryptographic access control for data lakes

SQL-aware encryption with block ciphers

Minimal overhead after initial session setup
