Glossary/Tool Abuse

What is Tool Abuse in AI Agents?

Tool abuse is the attack pattern where an AI agent uses the tools it has access to — file operations, API calls, code execution, email sending, payment processing — for purposes the operator did not authorize. Where prompt injection corrupts what the model says, tool abuse corrupts what the model does. It is the agentic-AI failure mode with the largest blast radius because the consequences extend beyond the conversation into real systems.

How tool abuse happens

Three mechanisms, often chained:

  1. Prompt-injection-driven tool calls. The model is hijacked (directly or via retrieved content) into calling tools the user didn't intend. Repello's Zapier exploit chained an indirect prompt injection in an inbound email with the Gmail-write tool to exfiltrate data.

  2. Excessive agency. The model has access to more tools, with broader scopes, than the task actually requires. A customer-service agent that can also wire money is a customer-service agent that can also be social-engineered into wiring money.

  3. Confused-deputy patterns. The agent acts on behalf of the user with the user's privileges, but takes those actions in response to instructions from a third party (the document the user asked it to summarize, the email it's responding to). The agent has authority but isn't the entity that decided to use it.
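The confused-deputy mechanism above can be sketched as a provenance check: tag each requested tool call with the instruction stream it came from, and refuse high-privilege calls that trace back to third-party content. This is an illustrative sketch under assumed names (`ToolCall`, `gate_tool_call`, the tool list), not a real agent framework's API.

```python
from dataclasses import dataclass

TRUSTED = "user"           # instruction typed directly by the operator
UNTRUSTED = "third_party"  # retrieved document, inbound email, web page

# Tools whose misuse has real-world consequences (assumed names).
HIGH_PRIVILEGE_TOOLS = {"send_email", "wire_transfer", "delete_file"}

@dataclass
class ToolCall:
    tool: str
    args: dict
    provenance: str  # which instruction stream requested this call

def gate_tool_call(call: ToolCall) -> bool:
    """Allow the call only if it is low-privilege or user-initiated."""
    if call.tool in HIGH_PRIVILEGE_TOOLS and call.provenance == UNTRUSTED:
        return False  # confused deputy: authority without user intent
    return True

# An injected instruction inside a summarized email tries to exfiltrate data:
injected = ToolCall("send_email", {"to": "attacker@example.com"}, UNTRUSTED)
legit = ToolCall("send_email", {"to": "boss@example.com"}, TRUSTED)
assert gate_tool_call(injected) is False
assert gate_tool_call(legit) is True
```

The design point is that the gate keys on who asked, not just what was asked: the same `send_email` call is allowed or blocked depending on provenance.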

Why tool abuse is the agentic blast-radius problem

A non-agentic LLM application's worst case is "the model says something bad to the user." An agentic application's worst case is "the model takes a bad action against connected systems." The set of bad actions is the union of every tool's capabilities. As tool counts grow (production agents commonly have 10-50 tools), the worst case grows accordingly.
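The union claim can be made concrete: model each tool as a set of capabilities, and the agent's worst case is the union over its tool set, so every added tool can only grow it. Tool names and capability labels below are made up for illustration.

```python
# Hypothetical tool registry mapping each tool to the capabilities it grants.
tools = {
    "search_docs": {"read:docs"},
    "summarize":   {"read:docs"},
    "send_email":  {"read:contacts", "write:email"},
    "run_sql":     {"read:db", "write:db"},
}

def blast_radius(tool_names):
    """Worst-case action set: the union of every attached tool's capabilities."""
    caps = set()
    for name in tool_names:
        caps |= tools[name]
    return caps

# Two read-only tools keep the worst case read-only:
assert blast_radius(["search_docs", "summarize"]) == {"read:docs"}
# Adding one tool expands the worst case to every system it can touch:
assert "write:db" in blast_radius(["search_docs", "run_sql"])
```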

Defending against tool abuse

The OWASP Agentic AI Top 10 codifies the relevant patterns. Practical defenses:

  1. Least-privilege tool access: expose only the tools, and the narrowest scopes, that the task actually requires.

  2. Human-in-the-loop approval for high-impact actions (payments, sends, deletes, writes to production systems).

  3. Treat retrieved content — documents, emails, web pages — as untrusted data, never as instructions.

  4. Scoped, short-lived credentials per tool rather than a single broad token shared across the agent.

  5. Logging and monitoring of every tool call, so abuse is detectable and auditable after the fact.
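A minimal sketch of the human-in-the-loop defense: wrap each sensitive tool so the agent cannot invoke it without an explicit approval decision. The `require_approval` wrapper, the `wire_transfer` tool, and the approver callbacks are all hypothetical names for illustration.

```python
def require_approval(tool_fn, approve):
    """Return a gated version of tool_fn that runs only if approve(...) says yes."""
    def gated(*args, **kwargs):
        if not approve(tool_fn.__name__, args, kwargs):
            raise PermissionError(f"{tool_fn.__name__} blocked: not approved")
        return tool_fn(*args, **kwargs)
    return gated

def wire_transfer(amount, to):
    # Stand-in for a real high-impact tool.
    return f"sent {amount} to {to}"

# Approvers for the example; in practice this would prompt a human operator.
deny_all = lambda name, args, kwargs: False
allow_all = lambda name, args, kwargs: True

gated = require_approval(wire_transfer, deny_all)
blocked = False
try:
    gated(500, "acct-123")
except PermissionError:
    blocked = True
assert blocked
assert require_approval(wire_transfer, allow_all)(5, "x") == "sent 5 to x"
```

Deny-by-default is the important design choice here: an unapproved call raises rather than silently proceeding, so a hijacked agent fails closed.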