🤖 AI Summary
To address the security risk of excessive privilege grants in LLM-driven tool-calling agents deployed in sensitive services, this paper proposes the first least-privilege authorization framework tailored for LLM-based tool invocation. Methodologically, it introduces a novel technique for automatically reconstructing tool permission hierarchies, integrated with a mobile-style, fine-grained permission model to enable dynamic, strict, and fully automated privilege enforcement. Key contributions include: (1) a permission hierarchy reconstruction algorithm; (2) a synthetic benchmark dataset covering 10 real-world applications; and (3) empirical results demonstrating a >90% reduction in redundant permissions while maintaining functionality—incurring only 1–6% additional inference latency—outperforming all baselines. This framework achieves a favorable trade-off between security and efficiency in LLM-powered agent systems.
📝 Abstract
Tool-calling agents are an emerging paradigm in LLM deployment, with major platforms such as ChatGPT, Claude, and Gemini adding connectors and autonomous capabilities. However, the inherent unreliability of LLMs introduces fundamental security risks when these agents operate over sensitive user services. Prior approaches either rely on manually written policies that require security expertise, or place LLMs in the confinement loop, which lacks rigorous security guarantees. We present MiniScope, a framework that enables tool-calling agents to operate on user accounts while confining potential damage from unreliable LLMs. MiniScope introduces a novel way to automatically and rigorously enforce least-privilege principles by reconstructing permission hierarchies that reflect relationships among tool calls and combining them with a mobile-style permission model to balance security and ease of use. To evaluate MiniScope, we create a synthetic dataset derived from ten popular real-world applications, capturing the complexity of realistic agentic tasks beyond existing simplified benchmarks. Our evaluation shows that MiniScope incurs only 1–6% latency overhead compared to vanilla tool-calling agents, while significantly outperforming the LLM-based baseline in minimizing permissions as well as computational and operational costs.
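The core idea—a permission hierarchy over tool calls combined with mobile-style, grant-on-first-use prompts—can be illustrated with a minimal sketch. This is not MiniScope's actual implementation; all names (`PermissionNode`, `Guard`) and the prompting flow are illustrative assumptions:

```python
# Hypothetical sketch of least-privilege enforcement over a tool-permission
# hierarchy. A grant on a parent permission implies all its descendants,
# and the user is prompted (mobile-style) only on first use of an
# uncovered permission. Names are illustrative, not from the paper.

class PermissionNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # parent permission in the hierarchy, if any

    def implied_by(self, granted):
        # Satisfied if this permission or any ancestor has been granted.
        node = self
        while node is not None:
            if node.name in granted:
                return True
            node = node.parent
        return False


class Guard:
    def __init__(self):
        self.granted = set()  # minimal set of user-approved permissions

    def request(self, perm, ask_user):
        # Already covered by an earlier (possibly broader) grant: no prompt.
        if perm.implied_by(self.granted):
            return True
        # Mobile-style runtime prompt; record the grant only if approved.
        if ask_user(perm.name):
            self.granted.add(perm.name)
            return True
        return False


# Toy hierarchy: email.read sits under a broader email permission.
email = PermissionNode("email")
email_read = PermissionNode("email.read", parent=email)

guard = Guard()
guard.request(email_read, ask_user=lambda name: True)  # prompts once
guard.request(email_read, ask_user=lambda name: False)  # covered, no prompt
```

The second `request` succeeds without re-prompting because `email.read` is already in the granted set; granting the parent `email` instead would cover `email.read` via the hierarchy walk in `implied_by`.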