🤖 AI Summary
Large language models (LLMs) frequently generate unauthorized responses under role-based access control (RBAC), judging authorization boundaries unreliably, especially at fine-grained table- or column-level granularity.
Method: We propose the first role-conditioned refusal evaluation framework tailored to fine-grained RBAC policies, built upon an enhanced security benchmark dataset derived from Spider and BIRD. Our approach integrates zero-shot/few-shot prompting, a generator-verifier two-stage architecture, LoRA fine-tuning, and native PostgreSQL role policy enforcement to ensure policy consistency.
Contribution/Results: Experiments show that the two-stage verification significantly improves refusal accuracy; LoRA fine-tuning achieves the highest execution accuracy; and policy complexity negatively correlates with model compliance. This work is the first to systematically characterize the trade-off between secure refusal and functional utility, establishing a reproducible evaluation benchmark and actionable technical pathway for trustworthy access control in LLMs.
📝 Abstract
Access control is a cornerstone of secure computing, yet large language models often blur role boundaries by producing unrestricted responses. We study role-conditioned refusals, focusing on the LLM's ability to adhere to access control policies by answering when authorized and refusing when not. To evaluate this behavior, we created a novel dataset that extends the Spider and BIRD text-to-SQL datasets, augmenting both with realistic PostgreSQL role-based policies at the table and column levels. We compare three designs: (i) zero- or few-shot prompting, (ii) a two-step generator-verifier pipeline that checks generated SQL against the policy, and (iii) LoRA fine-tuned models that learn permission awareness directly. Across multiple model families, explicit verification (the two-step pipeline) improves refusal precision and lowers false permits, while fine-tuning achieves a stronger balance between safety and utility (measured by execution accuracy). Longer and more complex policies consistently reduce the reliability of all systems. We release the RBAC-augmented datasets and code.
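The generator-verifier pipeline described above checks generated SQL against the role policy before permitting execution. A minimal table-level sketch of that second stage is shown below; the role names, policy mapping, and naive regex-based table extraction are illustrative assumptions, not the authors' implementation (a real verifier would use a proper SQL parser and enforce column-level grants as well):

```python
import re

# Hypothetical role policy: role -> set of readable tables (table-level RBAC).
# A column-level policy would map role -> table -> allowed columns instead.
POLICY = {
    "analyst": {"orders", "products"},
    "intern": {"products"},
}

def referenced_tables(sql: str) -> set:
    """Naively collect identifiers following FROM/JOIN.

    Illustrative only: a production verifier should use a real SQL
    parser (e.g. sqlglot) to handle aliases, subqueries, and CTEs.
    """
    pattern = r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)"
    return {m.group(1).lower()
            for m in re.finditer(pattern, sql, re.IGNORECASE)}

def verify(role: str, sql: str) -> str:
    """Verifier stage: permit the generated SQL only if every
    referenced table is granted to the requesting role."""
    allowed = POLICY.get(role, set())
    disallowed = referenced_tables(sql) - allowed
    if disallowed:
        return f"REFUSE: role '{role}' lacks access to {sorted(disallowed)}"
    return "PERMIT"
```

For example, `verify("analyst", "SELECT id FROM orders")` permits the query, while the same query under the `intern` role is refused because `orders` is outside that role's grant set; this mirrors the paper's goal of answering when authorized and refusing when not.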