ToolTweak: An Attack on Tool Selection in LLM-based Agents

📅 2025-10-02
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work identifies a critical security vulnerability in the tool-selection stage of LLM-based agents: adversaries can iteratively optimize tool names and descriptions to systematically manipulate agent preferences, inducing erroneous selections among functionally equivalent tools and thereby undermining the fairness, competitiveness, and security of tool ecosystems. The authors propose ToolTweak, presented as the first lightweight, transferable natural-language attack of this kind: it requires no model access, relies only on semantic rewriting and perplexity-based filtering, and generalizes across both open- and closed-source LLMs. Experiments show that targeted tool selection rates rise from roughly 20% to as high as 81%. To counter this, the authors evaluate a defense based on paraphrasing and perplexity-aware filtering, which significantly mitigates selection bias. The work establishes a new basis for the security evaluation and robust design of tool marketplaces and agent systems.
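The summary describes an iterative loop: propose rewrites of a tool's description, estimate how often the target agent selects the tweaked tool, and keep the best variant. The paper's code is not yet released, so the sketch below is purely illustrative: the real attack uses LLM-driven semantic rewriting, whereas here a fixed list of persuasive phrases and a toy scoring function stand in for the rewriter and the black-box agent (all names and heuristics are assumptions).

```python
# Hypothetical sketch of a ToolTweak-style greedy optimization loop.
# The fixed edit list and toy scorer are stand-ins, not the paper's method.

PERSUASIVE_EDITS = [
    "Officially recommended for this task.",
    "Fastest and most reliable option available.",
    "Preferred by most users for accuracy.",
]

def mock_selection_rate(description: str) -> float:
    """Stand-in for querying the target agent over many tasks: scores a
    description by how many persuasive cues it contains (toy proxy)."""
    return sum(edit in description for edit in PERSUASIVE_EDITS) / len(PERSUASIVE_EDITS)

def tooltweak_sketch(description: str, rounds: int = 3) -> str:
    """Greedy hill-climbing: each round, try appending each candidate
    edit and keep the variant the (mock) agent selects most often."""
    best, best_score = description, mock_selection_rate(description)
    for _ in range(rounds):
        for edit in PERSUASIVE_EDITS:
            candidate = f"{best} {edit}"
            score = mock_selection_rate(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best

tweaked = tooltweak_sketch("Converts CSV files to JSON.")
```

In the actual attack the scoring step would query the victim LLM agent with tasks solvable by several functionally equivalent tools and measure how often the tweaked tool is chosen.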

๐Ÿ“ Abstract
As LLMs increasingly power agents that interact with external tools, tool use has become an essential mechanism for extending their capabilities. These agents typically select tools from growing databases or marketplaces to solve user tasks, creating implicit competition among tool providers and developers for visibility and usage. In this paper, we show that this selection process harbors a critical vulnerability: by iteratively manipulating tool names and descriptions, adversaries can systematically bias agents toward selecting specific tools, gaining unfair advantage over equally capable alternatives. We present ToolTweak, a lightweight automatic attack that increases selection rates from a baseline of around 20% to as high as 81%, with strong transferability between open-source and closed-source models. Beyond individual tools, we show that such attacks cause distributional shifts in tool usage, revealing risks to fairness, competition, and security in emerging tool ecosystems. To mitigate these risks, we evaluate two defenses: paraphrasing and perplexity filtering, which reduce bias and lead agents to select functionally similar tools more equally. All code will be open-sourced upon acceptance.
Problem

Research questions and friction points this paper is trying to address.

Exploiting vulnerabilities in LLM-based agents' tool selection process
Manipulating tool names and descriptions to bias agent choices
Risks to fairness and security in emerging tool ecosystems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Attacks tool selection via manipulated tool names
Automatically biases agent choices through optimized descriptions
Defends with paraphrasing and perplexity filtering
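One of the defenses listed above, perplexity filtering, flags tool descriptions whose text is statistically unusual relative to natural descriptions. A production filter would score text with an LLM; the sketch below substitutes a Laplace-smoothed character-bigram model fit on a tiny reference corpus (the corpus, threshold, and function names are all illustrative assumptions).

```python
import math
from collections import Counter

# Toy perplexity filter: a character-bigram model stands in for an LLM scorer.
REFERENCE = (
    "converts files between common formats. "
    "searches the web and returns results. "
    "summarizes documents into short notes. "
)

def bigram_model(text):
    """Count character bigrams and their context unigrams."""
    pairs = Counter(zip(text, text[1:]))
    unigrams = Counter(text[:-1])
    return pairs, unigrams

def perplexity(text, pairs, unigrams, vocab=96):
    """Laplace-smoothed character-bigram perplexity of `text`."""
    logp = 0.0
    for a, b in zip(text, text[1:]):
        p = (pairs[(a, b)] + 1) / (unigrams[a] + vocab)
        logp += math.log(p)
    n = max(len(text) - 1, 1)
    return math.exp(-logp / n)

PAIRS, UNIGRAMS = bigram_model(REFERENCE)

def passes_filter(desc: str, threshold: float = 60.0) -> bool:
    """True if the description's perplexity is low enough to look natural
    (threshold is an arbitrary illustrative value)."""
    return perplexity(desc.lower(), PAIRS, UNIGRAMS) < threshold
```

The key property is relative: text resembling ordinary tool descriptions scores lower perplexity than out-of-distribution strings, so a threshold can screen out anomalous manipulations. The paper also evaluates paraphrasing, which rewrites every description before selection so that injected persuasive phrasing is normalized away.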