AI Summary
Existing safety mechanisms for large language models (LLMs) in enterprise settings lack role-aware access control: they mitigate harmful outputs but ignore fine-grained, role-based permission constraints.
Method: We propose a role-conditioned generation framework enabling context-sensitive, secure access control. Our approach introduces three modeling strategies, constructs the first dual-dataset benchmark for role-sensitive tasks, and integrates clustering-based annotation with synthetic data to train both BERT/LLM classifiers and a role-conditioned generative model.
Contribution/Results: This work is the first systematic study of role-conditioned generation for enterprise LLM security. The framework robustly distinguishes role-specific permissions across diverse organizational structures and demonstrates strong resilience against prompt injection, role-misalignment, and jailbreaking attacks. Experimental results confirm its effectiveness in enforcing granular, context-aware access policies without compromising utility.
Abstract
As large language models (LLMs) are increasingly deployed in enterprise settings, controlling model behavior based on user roles becomes an essential requirement. Existing safety methods typically assume uniform access and focus on preventing harmful or toxic outputs, without addressing role-specific access constraints. In this work, we investigate whether LLMs can be fine-tuned to generate responses that reflect the access privileges associated with different organizational roles. We explore three modeling strategies: a BERT-based classifier, an LLM-based classifier, and role-conditioned generation. To evaluate these approaches, we construct two complementary datasets. The first is adapted from existing instruction-tuning corpora through clustering and role labeling, while the second is synthetically generated to reflect realistic, role-sensitive enterprise scenarios. We assess model performance across varying organizational structures and analyze robustness to prompt injection, role mismatch, and jailbreak attempts.
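To make the idea of role-conditioned generation more concrete, the following is a minimal sketch of how a role-conditioned training example might be assembled. It is an illustration under stated assumptions, not the authors' pipeline: the role names, the `ROLE_PERMISSIONS` table, the prompt template, and the refusal text are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

# Hypothetical mapping from organizational role to the topics it may access.
# These roles and topics are illustrative assumptions, not the paper's data.
ROLE_PERMISSIONS = {
    "hr_manager": {"payroll", "hiring"},
    "engineer": {"codebase", "deployment"},
    "intern": set(),
}

# Hypothetical refusal text used as the target when access is denied.
REFUSAL = "Your role does not have access to this information."


@dataclass
class Example:
    prompt: str  # model input, including the role
    target: str  # desired model output for fine-tuning


def build_prompt(role: str, query: str) -> str:
    """Prefix the query with the user's role so the model can condition on it."""
    return f"[ROLE: {role}]\n[QUERY: {query}]"


def make_example(role: str, topic: str, query: str, answer: str) -> Example:
    """Pair the role-conditioned prompt with the answer or a refusal.

    The answer is kept only if the topic is permitted for the role;
    unknown roles are treated as having no permissions.
    """
    allowed = topic in ROLE_PERMISSIONS.get(role, set())
    return Example(prompt=build_prompt(role, query),
                   target=answer if allowed else REFUSAL)


if __name__ == "__main__":
    query = "What is the Q3 payroll budget?"
    for role in ("hr_manager", "engineer"):
        ex = make_example(role, "payroll", query, "The Q3 payroll budget is X.")
        print(ex.prompt)
        print("->", ex.target)
```

In this sketch the same query yields different targets depending on the role, which is the behavior a role-conditioned generator would be fine-tuned to reproduce. A classifier-based variant could instead predict the allow/deny decision from the same prompt and gate a separate generator.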