🤖 AI Summary
Fine-tuning LLMs on siloed enterprise data can break organizational access control: a single fine-tuned model may leak information across permission boundaries when serving users with different privileges. This work proposes Permissioned LLMs (PermLLM), a class of LLMs that superimpose the organization's access control structure on the query responses they generate. The authors formalize what it means for access control to be enforced correctly over LLM responses via the notion of a relevant response, and introduce an access advantage metric for empirically evaluating PermLLM mechanisms. They design three PermLLM mechanisms built on Parameter-Efficient Fine-Tuning (PEFT), and instantiate access advantage in two ways: the Domain Distinguishability Index (DDI), based on membership inference attacks, and the Utility Gap Index (UGI), based on LLM utility evaluation, which together quantify the trade-off between access control strength and model utility. Extensive experiments on GPQA, RCV1, SimpleQA, and WMDP demonstrate the efficacy of the mechanisms and validate the DDI and UGI metrics themselves.
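To make the utility-gap idea concrete, here is a minimal toy sketch of a UGI-style measurement. The paper's exact UGI formula is not given in this summary, so everything below (the exact-match "utility", the `toy_model`, and the `utility_gap` helper) is an illustrative assumption: the intuition is that effective access control shows up as a large gap in utility between callers who hold a domain's permission and callers who do not.

```python
# Illustrative sketch only: the real UGI definition is in the paper;
# names and the accuracy-based utility here are assumptions.

def accuracy(predictions, gold):
    """Fraction of exact-match answers (a stand-in for LLM utility)."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

def utility_gap(model, questions, gold, domain):
    """Utility with the domain permission minus utility without it."""
    with_perm = [model(q, {domain}) for q in questions]
    without_perm = [model(q, set()) for q in questions]
    return accuracy(with_perm, gold) - accuracy(without_perm, gold)

# Toy permissioned model: answers correctly only when the caller holds
# the "finance" permission; otherwise it refuses.
def toy_model(question, permissions):
    answers = {"Q3 revenue?": "$1M", "CFO name?": "Alice"}
    if "finance" in permissions:
        return answers.get(question, "unknown")
    return "ACCESS DENIED"

questions = ["Q3 revenue?", "CFO name?"]
gold = ["$1M", "Alice"]
print(utility_gap(toy_model, questions, gold, "finance"))  # → 1.0
```

A gap near 1.0 (as in this toy case) would indicate that the permission fully gates the model's useful knowledge of the domain; a gap near 0 would mean the model answers equally well regardless of permissions, i.e. the control is ineffective.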
📝 Abstract
In enterprise settings, organizational data is segregated, siloed, and carefully protected by elaborate access control frameworks. These access control structures can completely break down if an LLM fine-tuned on the siloed data serves downstream-task requests from individuals with disparate access privileges. We propose Permissioned LLMs (PermLLM), a new class of LLMs that superimpose the organizational data access control structures on the query responses they generate. We formalize abstractions underpinning the means to determine whether access control enforcement happens correctly over LLM query responses. Our formalism introduces the notion of a relevant response, which can be used to prove whether a PermLLM mechanism has been implemented correctly. We also introduce a novel metric, called access advantage, to empirically evaluate the efficacy of a PermLLM mechanism. We introduce three novel PermLLM mechanisms that build on Parameter-Efficient Fine-Tuning to achieve the desired access control. We further present two instantiations of access advantage: (i) the Domain Distinguishability Index (DDI), based on Membership Inference Attacks, and (ii) the Utility Gap Index (UGI), based on LLM utility evaluation. We demonstrate the efficacy of our PermLLM mechanisms through extensive experiments on four public datasets (GPQA, RCV1, SimpleQA, and WMDP), in addition to evaluating the validity of the DDI and UGI metrics themselves for quantifying access control in LLMs.
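The abstract does not spell out how the three PEFT-based mechanisms work, but the core idea of superimposing access control on response generation can be sketched. The following is a hypothetical, minimal illustration (the `PermissionedRouter` class, adapter names, and routing rule are all assumptions, not the paper's design): one fine-tuned adapter per data silo, with the response path consulting only adapters covered by the caller's permission set.

```python
# Hypothetical sketch: per-silo adapters gated by caller permissions.
# Stand-in callables replace real PEFT (e.g. LoRA) adapter modules.

class PermissionedRouter:
    """Routes queries to the fine-tuned adapters a caller may access."""

    def __init__(self, adapters):
        # adapters: {domain_name: callable(query) -> response}
        self.adapters = adapters

    def respond(self, query, user_permissions):
        # Adapters outside the caller's permission set are never
        # consulted, so their fine-tuned knowledge cannot leak.
        allowed = {d: a for d, a in self.adapters.items()
                   if d in user_permissions}
        if not allowed:
            return "ACCESS DENIED: no adapter matches your permissions"
        # Illustrative selection rule: first permitted domain in order.
        domain = sorted(allowed)[0]
        return allowed[domain](query)


# Toy adapters standing in for PEFT modules fine-tuned per data silo.
adapters = {
    "finance": lambda q: f"[finance adapter] answer to: {q}",
    "hr": lambda q: f"[hr adapter] answer to: {q}",
}

router = PermissionedRouter(adapters)
print(router.respond("Q3 revenue?", {"finance"}))
print(router.respond("Q3 revenue?", set()))
```

The design point this sketch illustrates is that enforcement happens structurally, by construction of the response path, rather than by asking a single monolithic fine-tuned model to self-censor, which is the failure mode the abstract warns about.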