ToolTweak: An Attack on Tool Selection in LLM-based Agents

📅 2025-10-02
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work identifies a critical security vulnerability in the tool-selection stage of LLM-based agents: adversaries can iteratively optimize tool names and descriptions to systematically manipulate agent preferences, inducing erroneous selections among functionally equivalent tools and thereby undermining the fairness, competitiveness, and security of tool ecosystems. The authors propose ToolTweak, presented as the first lightweight, transferable natural-language attack of this kind: it requires no model access, relies only on semantic rewriting and perplexity-based filtering, and generalizes across both open- and closed-source LLMs. Experiments show that targeted tool selection rates rise from roughly 20% to as high as 81%. To counter this, the authors evaluate a defense based on paraphrasing and perplexity-aware filtering, which significantly mitigates selection bias. The work establishes a new basis for the security evaluation and robust design of tool marketplaces and agent systems.
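The summary describes an iterative loop: propose rewrites of a tool's description, estimate how often the target agent selects the tweaked tool, and keep the best variant. The paper's code is not yet released, so the sketch below is purely illustrative: the real attack uses LLM-driven semantic rewriting, whereas here a fixed list of persuasive phrases and a toy scoring function stand in for the rewriter and the black-box agent (all names and heuristics are assumptions).

```python
# Hypothetical sketch of a ToolTweak-style greedy optimization loop.
# The fixed edit list and toy scorer are stand-ins, not the paper's method.

PERSUASIVE_EDITS = [
    "Officially recommended for this task.",
    "Fastest and most reliable option available.",
    "Preferred by most users for accuracy.",
]

def mock_selection_rate(description: str) -> float:
    """Stand-in for querying the target agent over many tasks: scores a
    description by how many persuasive cues it contains (toy proxy)."""
    return sum(edit in description for edit in PERSUASIVE_EDITS) / len(PERSUASIVE_EDITS)

def tooltweak_sketch(description: str, rounds: int = 3) -> str:
    """Greedy hill-climbing: each round, try appending each candidate
    edit and keep the variant the (mock) agent selects most often."""
    best, best_score = description, mock_selection_rate(description)
    for _ in range(rounds):
        for edit in PERSUASIVE_EDITS:
            candidate = f"{best} {edit}"
            score = mock_selection_rate(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best

tweaked = tooltweak_sketch("Converts CSV files to JSON.")
```

In the actual attack the scoring step would query the victim LLM agent with tasks solvable by several functionally equivalent tools and measure how often the tweaked tool is chosen.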

๐Ÿ“ Abstract
As LLMs increasingly power agents that interact with external tools, tool use has become an essential mechanism for extending their capabilities. These agents typically select tools from growing databases or marketplaces to solve user tasks, creating implicit competition among tool providers and developers for visibility and usage. In this paper, we show that this selection process harbors a critical vulnerability: by iteratively manipulating tool names and descriptions, adversaries can systematically bias agents toward selecting specific tools, gaining unfair advantage over equally capable alternatives. We present ToolTweak, a lightweight automatic attack that increases selection rates from a baseline of around 20% to as high as 81%, with strong transferability between open-source and closed-source models. Beyond individual tools, we show that such attacks cause distributional shifts in tool usage, revealing risks to fairness, competition, and security in emerging tool ecosystems. To mitigate these risks, we evaluate two defenses: paraphrasing and perplexity filtering, which reduce bias and lead agents to select functionally similar tools more equally. All code will be open-sourced upon acceptance.
Problem

Research questions and friction points this paper is trying to address.

Exploiting vulnerabilities in LLM-based agents' tool selection process
Manipulating tool names and descriptions to bias agent choices
Risks to fairness and security in emerging tool ecosystems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Attacks tool selection via manipulated tool names
Automatically biases agent choices through optimized descriptions
Defends with paraphrasing and perplexity filtering
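One of the defenses listed above, perplexity filtering, flags tool descriptions whose text is statistically unusual relative to natural descriptions. A production filter would score text with an LLM; the sketch below substitutes a Laplace-smoothed character-bigram model fit on a tiny reference corpus (the corpus, threshold, and function names are all illustrative assumptions).

```python
import math
from collections import Counter

# Toy perplexity filter: a character-bigram model stands in for an LLM scorer.
REFERENCE = (
    "converts files between common formats. "
    "searches the web and returns results. "
    "summarizes documents into short notes. "
)

def bigram_model(text):
    """Count character bigrams and their context unigrams."""
    pairs = Counter(zip(text, text[1:]))
    unigrams = Counter(text[:-1])
    return pairs, unigrams

def perplexity(text, pairs, unigrams, vocab=96):
    """Laplace-smoothed character-bigram perplexity of `text`."""
    logp = 0.0
    for a, b in zip(text, text[1:]):
        p = (pairs[(a, b)] + 1) / (unigrams[a] + vocab)
        logp += math.log(p)
    n = max(len(text) - 1, 1)
    return math.exp(-logp / n)

PAIRS, UNIGRAMS = bigram_model(REFERENCE)

def passes_filter(desc: str, threshold: float = 60.0) -> bool:
    """True if the description's perplexity is low enough to look natural
    (threshold is an arbitrary illustrative value)."""
    return perplexity(desc.lower(), PAIRS, UNIGRAMS) < threshold
```

The key property is relative: text resembling ordinary tool descriptions scores lower perplexity than out-of-distribution strings, so a threshold can screen out anomalous manipulations. The paper also evaluates paraphrasing, which rewrites every description before selection so that injected persuasive phrasing is normalized away.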